Authorship disparity has been analysed using techniques such as Proportion Analysis, Gender Ratio, Female-to-Male Authorship Odds Ratio (FAOR), Linear Regression, Linear Mixed Model, Prestige Index, Average Annual Growth Rates, Chi-Square Test, Z-Test, Binomial Regression Model, Collaboration Analysis, Gini Coefficient, Disparity Index, Actual Author Contribution of Co-authors and Normalized Citation Indicators (Table 2). Cole and Zuckerman (1984) examined simple bivariate differences in research publications by gender.
Table 2. Techniques used for analysis
| Technique | Description | Datasets | Authors | Remarks |
|---|---|---|---|---|
| Ratio analysis: | | | | |
| Proportion analysis | It is the share of women in the total authorship of scholarly articles. | Nature Index; Web of Science | Bendels et al. (2018); Holman et al. (2018) | It can measure the share of women's authorship in the total. |
| Female-to-Male Authorship Odds Ratio (FAOR) | It is the ratio between the share of a female authorship position in total female authorship and the share of male authorship in total male authorship. | Nature Index; Web of Science | Bendels et al. (2018); Bendels et al. (2018b) | It can be calculated for all first authors. At least two/three authorships are required to calculate last- and co-authors’ FAOR. |
| Regression analysis: | | | | |
| Linear regression | --- | US National Library of Medicine | Ouyang et al. (2018) | It measures the trends in authorship. |
| Logistic Regression Model (LRM) | Linear relationship of the author's gender with parameters of publication. | Journal publication | Filardo et al. (2016) | It measures the relationship between the author’s gender and publication parameters. |
| Linear mixed model | The model fit is obtained with % of women authors as the response variable. | PubMed; arXiv | Holman et al. (2018) | It estimates the share of women's authorship across positions and countries. |
| Binomial Regression Model | | Survey: Paper and Electronic | Kaufman and Chevan (2011) | It is useful to assess the relationship of gender with peer-reviewed articles. |
| Coefficients and indices: | | | | |
| Gini coefficient | It analyses the degree of disparity in authorship of scholarly articles. | Web of Science | Zeng et al. (2016) | It estimates the degree of disparity in authorship credits among co-authors. |
| Disparity Index | It measures the weights of collaborations. | Web of Science | Zeng et al. (2016) | It estimates the distribution of author credits. |
| Prestige Index | It indicates the holding of prestigious authorship. | Web of Science | Bendels et al. (2018); Bendels et al. (2018b) | It calculates the holding of prestigious authorship and its distribution among co-authors. |
| Relative intellectual contribution | It measures the actual contributions of collaborators to a published article. | | Rahman et al. (2017) | It is useful to measure the actual contribution of each collaborator and so avoid biased interpretation of authorship. |
| Average annual growth rates | It measures the changes in women's authorship. | Web of Science | Bendels et al. (2018b) | -- |
4.1 Ratio Analysis
4.1.1 Proportion Analysis
The proportion of female authorship can be defined simply as the ratio of female authorships to total authorships by men and women, multiplied by 100 for readability (equation (1)). Bendels et al. (2018) used proportion analysis to understand the share of female authorship in high-quality journals listed in the Nature Index. They calculated the proportion of female authorship (PFA) as:
\(PFA= {FA \over FA+MA} \times 100\) (after Bendels et al., 2018) (1)
\(FA\) = female authors and \(MA\) = male authors
PFA shows the quantitative representation of women in the authorship of scholarly articles (Bendels et al., 2018).
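Equation (1) can be illustrated with a minimal Python sketch. The function name and the counts are hypothetical and chosen only for illustration; they are not taken from Bendels et al. (2018).

```python
def proportion_female_authorship(female: int, male: int) -> float:
    """Return PFA = FA / (FA + MA) * 100, following equation (1)."""
    total = female + male
    if total == 0:
        raise ValueError("no gendered authorships to evaluate")
    return female / total * 100


# Hypothetical authorship counts, used only to show the call.
pfa = proportion_female_authorship(female=250, male=750)
print(f"PFA = {pfa:.1f} %")
```

Authorships whose gender could not be determined are assumed to have been excluded before counting, since equation (1) only contains female and male authorships.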
Holman et al. (2018) used the gender ratio to analyse the representation of women as: 1) first authors of multi-authored articles, 2) last authors of multi-authored articles, 3) authors of single-authored articles, and 4) authors overall, i.e. all authors of all published articles. The proportion of women authors (\(P\)) was modelled with a logistic function (equation (2)):
\(P= {{e^{0.5rt}} \over {2e^{0.5rt}}+c} \) (after Holman et al., 2018) (2)
where \(t\) is the date, \(r\) controls the steepness of the curve and \(c\) sets the inflection point. The function ‘assumes that the relationship between gender ratio and time is sigmoidal and progresses monotonically either towards gender parity or the complete disappearance of one gender’ (Holman et al., 2018). It captures non-linear changes in the gender ratio at the 95% confidence level.
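The behaviour of equation (2) is easier to see numerically. The sketch below simply evaluates the function; the parameter values and the choice of measuring \(t\) in years from an arbitrary origin are assumptions made for illustration, not values fitted by Holman et al. (2018).

```python
import math


def holman_proportion(t: float, r: float, c: float) -> float:
    """Evaluate P = e^{0.5 r t} / (2 e^{0.5 r t} + c), as in equation (2)."""
    growth = math.exp(0.5 * r * t)
    return growth / (2 * growth + c)


# Hypothetical parameters: t in years from an arbitrary origin, r > 0, c > 0.
for year in (-40, -20, 0, 20, 40):
    print(year, round(holman_proportion(year, r=0.1, c=2.0), 3))
```

For positive \(r\) and \(c\), the value rises monotonically and approaches 0.5 (parity) as \(t\) grows, and at \(t = 0\) it equals \(1/(2+c)\).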
4.1.2 Female-to-Male Authorship Odds Ratio (FAOR)
FAOR is the ratio between the odds that a female authorship falls in a given position (first, last, corresponding or other co-author) and the odds that a male authorship falls in that position (Bendels et al., 2018). For first authorships it is calculated as (equation (3)):
\(FAOR_{First}=FemaleOdds_{First}/MaleOdds_{First}\) (after Bendels et al., 2018) (3)
\(FemaleOdds_{First}=FemaleN_{First}/(FemaleN_{Co}+FemaleN_{Last})\)
\(MaleOdds_{First}=MaleN_{First}/(MaleN_{Co}+MaleN_{Last})\)
where \(FemaleN\) and \(MaleN\) are the numbers of female and male authorships of each type. Bendels et al. (2018) used the FAOR as the basis of their Prestige Index analysis.
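A short Python sketch of equation (3) follows. The counts are hypothetical. Applying the same pattern to last and co-authorships (the position count divided by the sum of the other two) is an assumption that extends the first-author formula shown above.

```python
POSITIONS = ("first", "co", "last")


def position_odds(counts: dict, position: str) -> float:
    """Odds of holding `position`: N_position / (sum of the other positions)."""
    others = sum(counts[p] for p in POSITIONS if p != position)
    if others == 0:
        raise ValueError("odds undefined: no authorships in other positions")
    return counts[position] / others


def faor(female: dict, male: dict, position: str = "first") -> float:
    """Female-to-male authorship odds ratio for one position."""
    male_odds = position_odds(male, position)
    if male_odds == 0:
        raise ValueError("FAOR undefined: male odds are zero")
    return position_odds(female, position) / male_odds


# Hypothetical authorship counts by position.
female = {"first": 120, "co": 300, "last": 60}
male = {"first": 200, "co": 500, "last": 300}
print(round(faor(female, male, "first"), 3))
print(round(faor(female, male, "last"), 3))
```

A value above 1 means women hold the position with higher odds than men, and a value below 1 means lower odds.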
4.2 Regression Models
Scholars have used linear regression (Ouyang et al., 2018), logistic regression, linear mixed models and binomial regression to estimate trends in women's authorship of scholarly publications in STEMM.
4.2.1 Logistic Regression Model (LRM)
Macaluso et al. (2016) and Filardo et al. (2016) used logistic regression models to analyse the relationship between gender and authorship type. Macaluso et al. (2016) showed a significant association of gender with type of contributorship, whereas Filardo et al. (2016) related gender to the journal’s impact and the time of publication.
4.2.2 Linear Mixed Model (LMM)
A linear mixed model was used to examine how the author gender ratio varies across countries (Holman et al., 2018). The model (equation (4)) was fitted with the percentage of women authors as the response variable.
\(\%\ women\ authors \sim a_1 Position + a_2 Date + a_3 x_1+a_4 x_2+a_5 x_3+a_6 x_4+a_7 x_5+a_8 x_6+a_9 x_7+(Date/Journal)+(Date/Discipline)+(Date/Country)\) (after Holman et al., 2018) (4)
\(x_1\) to \(x_7\) are the seven UN predictor variables, ‘Date’ is the date of publication and ‘Position’ is the authorship position (first, middle, last, single, corresponding, etc.). The UN predictors were standardised to mean 0 and variance 1. Here, journal is a random intercept and date of publication is a random slope.
4.2.3 Binomial Regression Model (BRM)
A negative binomial regression model (BRM) was used to examine the effect of gender on the publication of peer-reviewed articles (Kaufman and Chevan, 2011). The negative binomial form accommodates the overdispersion that the Poisson model showed for these count data (Kaufman and Chevan, 2011).
4.3 Coefficients and Indices
4.3.1 Gini Coefficient
The Gini coefficient (Gini, 1912) measures the degree of inequality in a statistical distribution and is derived from the Lorenz curve (Ultsch and Lötsch, 2017). Zeng et al. (2016) used it (equation (5)) to measure the degree of inequality in the distribution of authorship credits among co-authors.
\(G(a)= {2 \sum^{n_{c}}_{i=1} iy_i \over n_c \sum^{n_{c}}_{i=1} y_i} - {n_c+1\over n_c}\) (after Zeng et al., 2016) (5)
Here, \(a\) is the author and \(n_c\) is the number of his or her co-authors. Zeng et al. (2016) counted the number of collaborations \(y_i\) between \(a\) and each co-author \(c_i\), and then arranged the \(y_i\) in non-decreasing order so that \(y_i \leq y_{i+1}\). Further, Chien et al. (2018) used this coefficient to analyse authors' research domains and the ordering of author names on scholarly articles.
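As a minimal sketch, the Gini coefficient of one author's collaboration counts can be computed as below. The code uses the standard rank-based Gini form, with \(n_c \sum y_i\) in the denominator of the first term; the function name and the example counts are hypothetical.

```python
def gini_collaboration(counts: list) -> float:
    """Gini coefficient of collaboration counts y_i (one count per co-author).

    Uses G = 2 * sum(i * y_i) / (n_c * sum(y_i)) - (n_c + 1) / n_c
    with the y_i sorted in non-decreasing order and i starting at 1.
    """
    y = sorted(counts)
    n_c = len(y)
    total = sum(y)
    if n_c == 0 or total == 0:
        raise ValueError("need at least one co-author with a collaboration")
    ranked_sum = sum(i * y_i for i, y_i in enumerate(y, start=1))
    return 2 * ranked_sum / (n_c * total) - (n_c + 1) / n_c


# Hypothetical counts: times the author published with each of four co-authors.
print(gini_collaboration([3, 3, 3, 3]))   # perfectly even collaboration
print(gini_collaboration([0, 0, 0, 10]))  # all collaboration with one co-author
```

An author who collaborates equally often with every co-author obtains 0, and the value grows towards \((n_c-1)/n_c\) as collaboration concentrates on a single co-author.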
4.3.2 Prestige Index
The Prestige Index (PI) indicates the extent to which women, compared with men, hold prestigious authorships (first and last authorships) (Bendels et al., 2018; Bendels et al., 2018a). It is the prestige-weighted average of the FAOR excess \(ε_t\) (equation (6)) calculated for all authorship types:
\(ε_t=w_t (FAOR_t-1)\) if \(FAOR_t≥1\), otherwise \(ε_t=w_t (1-1/FAOR_t)\) (after Bendels et al., 2018a) (6)
\(w_t\) is the weighting factor and \(t\) is the authorship type. Bendels et al. (2018a) weighted co-authorships negatively (\(w_{co}=-1\)) and first and last authorships positively (\(w_{first} = w_{last} = 1\)). A PI of 0 indicates a balanced distribution of prestigious authorships between women and men, whereas values above 0 indicate an excess, and values below 0 a lack, of prestigious authorships held by women. A higher FAOR for first and last authorships increases the PI, while a higher FAOR for co-authorships (middle authors) lowers it. Bendels et al. (2018) excluded alphabetically arranged author lists from the PI analysis.
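The following sketch shows one way the FAOR excess and the PI could be computed. Two points are assumptions rather than details confirmed here: the threshold in equation (6) is taken to be \(FAOR_t ≥ 1\), and the PI is taken to be the plain mean of \(ε_t\) over the three authorship types. The FAOR values are hypothetical.

```python
# Weights reported for Bendels et al. (2018a): co-authorship counts negatively.
WEIGHTS = {"first": 1.0, "last": 1.0, "co": -1.0}


def faor_excess(faor: float, weight: float) -> float:
    """epsilon_t = w_t (FAOR_t - 1) if FAOR_t >= 1, else w_t (1 - 1 / FAOR_t)."""
    if faor <= 0:
        raise ValueError("FAOR must be positive")
    if faor >= 1:
        return weight * (faor - 1)
    return weight * (1 - 1 / faor)


def prestige_index(faor_by_type: dict) -> float:
    """Assumed form: mean of the weighted FAOR excess over all authorship types."""
    excesses = [faor_excess(faor_by_type[t], w) for t, w in WEIGHTS.items()]
    return sum(excesses) / len(excesses)


# Hypothetical FAOR values per authorship type.
print(prestige_index({"first": 1.0, "last": 1.0, "co": 1.0}))  # balanced case
print(prestige_index({"first": 0.5, "last": 0.5, "co": 2.0}))  # women under-represented in prestigious positions
```

The two branches of `faor_excess` make the excess symmetric: a FAOR of 2 and a FAOR of 0.5 give excesses of the same size and opposite sign before weighting.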
4.3.3 Disparity Index
Zeng et al. (2016) calculated the weight of collaboration between \(a\) and \(c_i\) (equation (7)):
\(W_{ac_i}=\sum^{k_{c_i}}_{j=1} {1\over l_j-1}\) (after Zeng et al., 2016) (7)
\(k_{c_i}\) is the number of publications by \(a\) and \(c_i\) together and \(l_j\) is the number of co-authors of publication \(j\). They then calculated the total weight of collaborations for author \(a\) (equation (8)):
\(S_a=\sum^{n_c}_{i=1} W_{ac_i}\) (after Zeng et al., 2016) (8)
Finally, the disparity index was calculated (equation (9)) as:
\(\gamma (a)=n_c \sum^{n_c}_{i=1} \left({W_{ac_i} \over S_a}\right)^2\) (after Zeng et al., 2016) (9)
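Equations (7) to (9) can be chained in a few lines of Python. This is a sketch under one stated assumption: \(l_j\) is counted as all authors listed on publication \(j\), including \(a\), so that \(l_j - 1\) is the number of \(a\)'s co-authors on that paper. The author labels and publication lists are hypothetical.

```python
from collections import defaultdict


def collaboration_weights(author: str, publications: list) -> dict:
    """W_{a c_i}: sum over joint papers of 1 / (l_j - 1), equation (7)."""
    weights = defaultdict(float)
    for authors in publications:
        if author not in authors or len(authors) < 2:
            continue  # skip papers without the author or without co-authors
        share = 1 / (len(authors) - 1)
        for coauthor in authors:
            if coauthor != author:
                weights[coauthor] += share
    return dict(weights)


def disparity_index(weights: dict) -> float:
    """gamma(a) = n_c * sum_i (W_{a c_i} / S_a)^2, equations (8) and (9)."""
    n_c = len(weights)
    if n_c == 0:
        raise ValueError("author has no co-authors")
    s_a = sum(weights.values())  # total collaboration weight S_a
    return n_c * sum((w / s_a) ** 2 for w in weights.values())


# Hypothetical publication lists for author "a".
pubs = [["a", "b"], ["a", "b"], ["a", "c", "d"]]
w = collaboration_weights("a", pubs)
print(w)
print(disparity_index(w))
```

With equal weights for every co-author the index equals 1, and it approaches \(n_c\) as the weight concentrates on one co-author.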
Ouyang et al. (2018) used the Mann-Whitney U test and the \(\chi^2\) test to determine significant differences between male and female authors.
4.4 Normalized Citation Indicators
Besselaar and Sandstrom (2016) used field-normalized citation indicators to analyse gender differences in research performance in relation to career progress:
P: Number of publications, full counting
Frac P: Number of publications, fractional counting based on author shares
NCSf: Field normalized citation score, e.g. 2014
NCSf2y: Field normalized citation score, year window e.g. 2 years
TOP x %: Share of publications in the set of (1, 5, 10, 25 and 50 %) most highly cited publications, field normalized
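The difference between the first two indicators, P and Frac P, can be sketched as follows. The example assumes that each of the \(n\) authors of a paper receives an equal share of \(1/n\); the actual share scheme used by Besselaar and Sandstrom (2016) is not specified here, and the author labels and papers are hypothetical.

```python
def publication_counts(author: str, papers: list) -> tuple:
    """Return (P, Frac P) for one author.

    P counts every paper the author appears on (full counting).
    Frac P credits 1 / (number of authors) per paper (equal shares assumed).
    """
    full = 0
    fractional = 0.0
    for authors in papers:
        if author in authors:
            full += 1
            fractional += 1 / len(authors)
    return full, fractional


# Hypothetical author lists.
papers = [["a"], ["a", "b"], ["b", "c"], ["a", "b", "c", "d"]]
print(publication_counts("a", papers))
```

Full counting treats a single-authored paper and a four-author paper alike, whereas fractional counting lowers the credit as the author list grows.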
4.5 Collaboration Analysis
Zeng et al. (2016) used the Gini coefficient and the disparity index to measure the homogeneity of author collaboration and thereby understand the collaboration opportunities for women authors. A higher Gini coefficient or disparity index indicates less homogeneous collaboration: the author collaborates repeatedly with a small portion of his or her co-authors and only a few times with the remaining majority. Such an author has a high propensity to collaborate with only a few co-authors.
4.6 Actual Contribution of Multiple Authors
Macaluso et al. (2016) and Rahman et al. (2017) suggested replacing authorship with contributorship to illuminate potential disparities among the authors of scholarly publications. Rahman et al. (2017) classified author contributions into three groups: 1) intellectual activities (IA), 2) logistics support (LS) and 3) IA and LS combined. IA includes initiation of the research proposal, review of literature, design of the research methodology, technical guidance, instrumental setup, data collection, data analysis and interpretation, writing the manuscript, revisions, etc. Language editing, laboratory facilities and data collection count as LS for conducting the research project and preparing the manuscript. They suggested relative weights (0 to 1) to identify the contribution of each author to each activity (\(IC^a_i\)). The relative intellectual contribution of each author (\(IC^r\)) can then be calculated as (equation (10)):
\(IC^r=\sum^{n}_{i=1} WF_i \times IC^a_i\) (after Rahman et al., 2017) (10)
where \(n\) is the number of different activities, \(IC^a_i\) is the value of the author's intellectual contribution to activity \(i\) and \(WF_i\) is the weight of intellectual activity \(i\). They considered equal weighting of all intellectual activities, different weights for an intellectual activity in different papers, and equal weight for multiple intellectual activities. The method helps to calculate the actual contribution of each author in place of the traditional, vague and bias-prone way of assigning author contributions. It is intended to reduce misappropriation of authorship credit and misconduct in the scientific community (Macaluso et al., 2016).
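Equation (10) is a weighted sum, which the sketch below makes concrete. The activity names, the weights \(WF_i\) and the contribution values are all hypothetical, and the choice of weights that sum to 1 is an assumption made so that the result stays between 0 and 1.

```python
def relative_intellectual_contribution(activity_weights: dict, contributions: dict) -> float:
    """IC^r = sum_i WF_i * IC^a_i, following equation (10).

    `activity_weights` maps activity -> WF_i; `contributions` maps
    activity -> IC^a_i in [0, 1]. Missing activities count as 0.
    """
    total = 0.0
    for activity, weight in activity_weights.items():
        share = contributions.get(activity, 0.0)
        if not 0.0 <= share <= 1.0:
            raise ValueError(f"contribution for {activity!r} must lie in [0, 1]")
        total += weight * share
    return total


# Hypothetical weights for three intellectual activities (assumed to sum to 1).
weights = {"design": 0.25, "analysis": 0.25, "writing": 0.5}
# Hypothetical contributions of one author to each activity.
author_shares = {"design": 1.0, "analysis": 0.0, "writing": 0.5}
print(relative_intellectual_contribution(weights, author_shares))
```

Changing the `weights` dictionary is enough to switch between equal weighting and activity-specific weighting.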
4.7 Average Annual Growth Rates
Bendels et al. (2018a) calculated the average annual growth rates of PFA (female authorship), FAOR and the Prestige Index to understand their temporal development.
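The exact formula used by Bendels et al. (2018a) is not given here. A common definition of the average annual growth rate, assumed in the sketch below, is the mean of the year-over-year percentage changes. The yearly PFA values are hypothetical.

```python
def average_annual_growth_rate(values: list) -> float:
    """Mean of year-over-year percentage changes (assumed AAGR definition)."""
    if len(values) < 2:
        raise ValueError("need at least two yearly values")
    rates = []
    for previous, current in zip(values, values[1:]):
        if previous == 0:
            raise ValueError("growth rate undefined for a zero base value")
        rates.append((current - previous) / previous * 100)
    return sum(rates) / len(rates)


# Hypothetical PFA values (in %) for three consecutive years.
print(round(average_annual_growth_rate([20.0, 22.0, 24.2]), 2))
```

Because it averages relative changes, this measure can be applied in the same way to PFA, FAOR or the Prestige Index, provided the base values are non-zero.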
4.8 Significance Tests
Budden et al. (2008) used the chi-square test and the Z-test to analyse the representation of male first authors, female first authors and first authors of unknown gender.
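How Budden et al. (2008) set up their tests is not detailed here. As an illustration of the general technique only, the sketch below runs a standard two-proportion Z-test on the share of female first authors in two hypothetical samples, using only the Python standard library.

```python
import math


def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int) -> tuple:
    """Two-sided Z-test for equality of two proportions (pooled variance).

    Returns (z, p_value). x1/n1 and x2/n2 are the observed proportions.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if std_err == 0:
        raise ValueError("test undefined: pooled proportion is 0 or 1")
    z = (p1 - p2) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical counts: female first authors out of all first authors,
# in an earlier and a later sample of articles.
z, p = two_proportion_z_test(x1=20, n1=100, x2=40, n2=100)
print(round(z, 3), round(p, 4))
```

A three-category comparison (male, female, unknown) would instead call for a chi-square test on the full contingency table.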