
Evidence-based assessment, equity and opportunity

September 13, 2023

By Margaret Sheil, AO, FAA, FTSE

“The reason metrics and other indicators have such widespread currency is not just because they provide the illusion of accuracy; it is because they are so convenient.” (Prof Margaret Sheil, AO, Vice-Chancellor and President of QUT, Australia)

When it comes to assessing academic career progress, metrics can be deceptive, especially when evaluating individuals from diverse backgrounds.

My career has traversed scientific research, university teaching, academic leadership and research policy at three very different universities, and in government as chief executive of the Australian Research Council (ARC) — the major national agency for funding non-medical research in Australian universities. I recently led a review of the ARC for the current government, more than 10 years after I left that role. When reflecting on my career, I am regularly asked how I have simultaneously presided over meritocratic institutions or processes whilst strongly supporting initiatives designed to increase access, success and representation of underrepresented groups.

This article is from the Not Alone newsletter, a monthly publication that showcases new perspectives on global issues directly from research and academic leaders.

Merit and opportunity

Early in my career, the focus was women in science; later initiatives supported First Nations students and staff. Each of my institutions, and indeed Australian government policy, has sought to support students who have not historically had access to higher education. The tenor of these inevitable questions presumes that merit can be objectively assessed, that equality is the same as equity, and that promoting excellence is in tension with access. The latter assumption underpins the recent US Supreme Court decision to overturn the legal basis for affirmative action programs in employment and admissions — running counter to President Johnson’s declaration when he introduced affirmative action in the 1960s: “We must seek not just equality as a right and a theory but equality as a fact and result … equal opportunity is essential but not enough.”

That principle has not changed: talent is broadly distributed; opportunity is not. So whether we are selecting for admission to our universities or recruiting staff, we must not start with the assumption that each has the same opportunity to develop or demonstrate their ability.

Women in research

My interest in equity started early in my career. Even by 1990, women in tenure-track roles in university science departments in Australia were rare. As I was striving to compete for research funding and publications in the top journals and establish myself as an independent scientist, I initially resented the frequent calls on my time to sit on recruitment and selection panels to provide gender input (not gender balance, as we did not have the numbers). Often the only woman in the room, I became acutely conscious of, and ultimately more outspoken about, the barriers, inequities and unconscious biases when it came to assessing the “merit” of various individuals.

When I later joined the university-wide tenure/promotion committee, I naively thought that if we presented more data and more publication metrics on candidates, these biases would not persist. They usually did, however, just in different, seemingly more objective forms. I found that efforts to improve metrics often had the opposite effect. For example, substituting total numbers of publications and citations with a measure such as the h-index, which considers cumulative performance over time, has exacerbated inequities for those who have had non-linear or interrupted careers. Those affected are mostly women with caring responsibilities, but also people who have taken time out to commercialise their research or to work in an industry partnership.
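
To see why, recall how the h-index is computed: an author has index h if h of their papers have at least h citations each. A minimal sketch (the citation counts are invented for illustration, not data from any real career) shows how a career break, by halving the papers that can accumulate citations, drags the index down even when each individual paper is equally strong:

```python
def h_index(citations: list[int]) -> int:
    """Largest h such that h papers each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, cites in enumerate(ranked, start=1):
        if cites >= rank:
            h = rank
        else:
            break
    return h

# Two hypothetical researchers whose individual papers are equally cited,
# but one lost five productive years to a career break:
uninterrupted = [25, 18, 14, 12, 9, 8, 7, 6, 5, 4]  # ten papers
interrupted = [25, 18, 14, 12, 9]                    # five papers

print(h_index(uninterrupted))  # 7
print(h_index(interrupted))    # 5
```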

My general observation is that the more precise or informative a new indicator seems, especially when it is applied as a solution, the less scrutiny there is of the quality of, and the opportunities available to, the individuals it is applied to.

Because of these challenges, I frequently heard as a young scientist that “if you take a break, you can never return.” Fortunately, we now have numerous amazing role models to counter that claim. In my own discipline of mass spectrometry, Prof Dame Carol Robinson, FRS is an example of someone who has had a non-traditional yet outstanding career journey and success.

Biases in academic recruitment and promotion adversely impact women and minority groups in academia, while the resulting lack of diversity further hampers academic achievement and ultimately diminishes research’s potential societal impact. Apart from being the right thing to do for affected individuals, there is ample evidence that diverse teams perform better and produce better outcomes. Diverse teams in research institutions improve research methodology and question development. For example, there is an abundance of research and policy decisions based on data or evidence generated from men, with findings then extrapolated to women without considering differences between the sexes.

At the Australian Research Council, when confronted with the overall lack of success for women, we made changes to encourage reporting and equitable treatment of those who had had career breaks, and to encourage more women to apply. These resulted in some progress, but our recent review found something paradoxical: over the ensuing 12 years, in an effort to increase fairness and objectivity, more and more detail was added to the career descriptions, to the point that reviewers paid less and less attention, potentially eroding the impact of this important change over time.

Alongside the changes in presentation of research records, we also analysed in more detail the number of grants awarded to men and women applicants at different career stages. This yielded some unexpected results, including that the difference in overall success rates (the proportion of applicants awarded grants) was greatest at the earliest career stages (0-5 years post-PhD). We therefore concluded that these differences could not be due simply to women taking career breaks or not being promoted at the same rate. Rather, men were given more sponsorship and opportunities for first authorship, conference presentations and more challenging or different projects (e.g., quantitative versus qualitative), which manifested as seemingly better publication metrics from the outset of their careers. In contrast, the scores for project quality were not correlated with the gender of the applicant. Our conclusion was similar to that expressed in the San Francisco Declaration on Research Assessment (DORA), which notes that funding agencies should “be explicit about the criteria used in evaluating the scientific productivity of grant applicants and clearly highlight, especially for early-stage investigators, that the scientific content of a paper is much more important than publication metrics or the identity of the journal in which it was published.”
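
Success rate here is simply grants awarded divided by applications, computed per cohort. A toy sketch with invented figures (not ARC data) shows the kind of comparison we ran, where the gender gap is widest in the earliest cohort:

```python
# Hypothetical grant-round data; every figure below is invented purely to
# illustrate the calculation and is not ARC data.
rounds = {
    ("0-5 yrs post-PhD", "men"):    {"applied": 400, "awarded": 88},
    ("0-5 yrs post-PhD", "women"):  {"applied": 300, "awarded": 51},
    ("5-10 yrs post-PhD", "men"):   {"applied": 350, "awarded": 70},
    ("5-10 yrs post-PhD", "women"): {"applied": 280, "awarded": 53},
}

for (stage, group), n in rounds.items():
    rate = n["awarded"] / n["applied"]  # success rate: awarded / applied
    print(f"{stage:<18} {group:<6} {rate:.1%}")
# Early-stage gap: 22.0% vs 17.0%; mid-career gap: 20.0% vs 18.9%
```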

Metrics and measurement

During my time as CEO, the ARC also developed and implemented a research evaluation system for Australian universities, known as the Excellence in Research for Australia (ERA) initiative. ERA aimed to use metrics and other evidence to inform expert assessments of the standing of individual disciplines within each institution. It was underpinned by detailed consultation and testing of the types of indicators that could serve as proxies for quality. These ranged from traditional publication metrics in the natural and physical sciences to peer review in the humanities and social sciences, evidence of applications in professional fields (e.g., conferences or reports), and non-traditional outputs in the creative and performing arts.

A challenge for a national research measurement exercise is that it is impossible not to influence what you are trying to measure, as Heisenberg demonstrated for physical systems almost a century ago. When research metrics are applied to the consideration of individual academic performance, academics respond to the measurement system and may change their behaviour. One detrimental effect was a focus on publications in international journals and arguably less focus on research relevant to the Australian or local context, interdisciplinary research and research with longer time horizons. On the positive side, the overall quality of Australian research improved. However, the benefits relative to the burden of the exercise have diminished over time. Our recent review recommended to government that the effort would be better directed elsewhere. We further recommended against any attempt to replace the current evaluation exercise with metrics alone, however tempting and convenient that option may be.

Indicators here to stay?

The reason metrics and other indicators have such widespread currency, however, is not just because they provide the illusion of accuracy; it is because they are so convenient: they save time and reduce the cost of assessment. Time, as we know in the increasingly busy world of academia, is always at a premium. Undertaking detailed peer review of publications and grants, or serving on promotion and selection panels, is typically work done over and above other obligations in teaching, research and university or professional service. Tools that enable those roles to be done more efficiently, and hopefully more expertly, are attractive and inevitable. Similarly, tools that compare different parts of the university appeal to busy university leaders. I use them in that way too, but I am careful not to apply them indiscriminately to individuals. The irony of leaders who complain about the vagaries of the indicators used by international ranking agencies, yet use similar methods within their own institutions when allocating funding, is not lost on our staff.

Looking beyond the traditional indicators

My own institution has recently committed to DORA and commenced work on the implementation of principles that align with best practice in research assessment. We recognise the inevitability and increasing attractiveness of more sophisticated search and artificial intelligence tools, each of which may introduce discrimination and biases affecting different groups and individuals, especially those with more complex experiences and careers.

Given a long-standing interest in identifying and rewarding those who will be good mentors, I asked the International Center for the Study of Research (ICSR) team at Elsevier whether this could be gleaned from publishing behaviour. For example, a good mentor may have a lot of junior co-authors and, depending on the field of research, choose to put their junior co-authors in senior positions. An individual who may be good at promoting diversity will have more diverse co-authors, and so on.
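
As a toy illustration of how such an indicator might be derived from authorship data (this is my sketch, not the ICSR’s method, and it assumes a field where the first-author slot signals who led the work), one could measure the share of a researcher’s collaborative papers led by a junior colleague:

```python
# Toy sketch of a mentorship indicator from co-authorship records.
# All structures and the authorship convention are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Paper:
    authors: list[str]      # author names in byline order
    stages: dict[str, str]  # name -> "junior" | "senior"

def junior_first_author_share(papers: list[Paper], mentor: str) -> float:
    """Among the mentor's co-authored papers, fraction led by a junior colleague."""
    relevant = [p for p in papers if mentor in p.authors and len(p.authors) > 1]
    if not relevant:
        return 0.0
    led_by_junior = sum(
        1 for p in relevant
        if p.authors[0] != mentor and p.stages.get(p.authors[0]) == "junior"
    )
    return led_by_junior / len(relevant)
```

Any real indicator would need to respect field-specific authorship conventions; in some disciplines the last-author position, not the first, marks the senior mentor.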

Preliminary work undertaken with ICSR was aimed at developing a dashboard that provides a more holistic picture of individuals. A focus on highlighting attributes that are not normally recognised — like mentoring, social engagement and collaboration — may assist in encouraging diversity and achieving equity in all our collective endeavours. Still, there is more work to be done and many challenges ahead, not least the rise of generative AI and its impact.

If indicators are here to stay, we need to increase awareness and education among those who use them, identify better ways to account for the complexity of diverse career trajectories and for the quality and impact measures relevant to different disciplines and cohorts of staff — and be mindful of the biases inherent in any data set. The Hippocratic Oath as it applies to physicians should also apply to the use of metrics: first, do no harm.

Prof Margaret Sheil, AO, FAA, FTSE

Prof Margaret Sheil commenced as Vice-Chancellor and President of Queensland University of Technology in 2018, having previously been Provost at The University of Melbourne (2012-2017) and Chief Executive Officer of the Australian Research Council (2007-2012). Professor Sheil has been an academic in chemistry and held a number of senior roles at the University of Wollongong, including as Dean of Science and Deputy Vice-Chancellor (Research). She is a Fellow of the Australian Academy of Science (AAS), the Australian Academy of Technology and Engineering (ATSE), the Royal Australian Chemical Institute (RACI) and the Australian and New Zealand Society for Mass Spectrometry (ANZSMS). Professor Sheil is Chair of the Board of the Queensland Museum Network, and Deputy Chair and lead Vice-Chancellor for Research of Universities Australia. In 2023 Professor Sheil chaired the “Trusting Australia’s Ability: Review of the Australian Research Council Act 2001.”

Contributor

Prof Margaret Sheil, AO, FAA, FTSE

Vice-Chancellor and President

Queensland University of Technology