Non-commercial pharmaceutical R&D: what do neglected diseases suggest about costs and efficiency?

Background: The past two decades have witnessed significant growth in non-commercial research and development (R&D) initiatives, particularly for neglected diseases, but there is limited understanding of the ways in which they compare with commercial R&D. This study analyses costs, timelines, and attrition rates of non-commercial R&D across multiple initiatives and how they compare to commercial R&D.
Methods: This is a mixed-method, observational, descriptive, and analytic study. We contacted 48 non-commercial R&D initiatives and received either quantitative and/or qualitative data from 13 organizations. We used the Portfolio to Impact (P2I) model’s estimates of average costs, timelines, and attrition rates for commercial R&D, while noting that P2I cost estimates are far lower than some previous findings in the literature.
Results: The quantitative data suggested that the costs and timelines per candidate per phase (from preclinical through Phase 3) of non-commercial R&D for new chemical entities are largely in line with commercial averages. The quantitative data was insufficient to compare attrition rates. The qualitative data identified more reasons why non-commercial R&D costs would be lower than commercial R&D, timelines would be longer, and attrition rates would be equivalent or higher, though the data does not allow for estimating the magnitude of these effects.
Conclusions: The quantitative data suggest that costs and timelines per candidate per phase were largely in line with (lower-end estimates of) commercial averages. We were unable to draw conclusions on overall efficiency, however, due to insufficient data on attrition rates. Given that non-commercial R&D is a nascent area of research with limited data available, this study contributes to the literature by generating hypotheses for further testing against a larger sample of quantitative data. It also offers a range of explanatory factors for further exploration regarding how non-commercial and commercial R&D may differ in costs and efficiency.


Introduction
The costs and efficiency of biomedical research and development (R&D) have long been of interest to scholars, practitioners and policymakers alike. These questions have recently gained increased salience in light of concerns about the potentially declining productivity of commercial R&D; missing technologies such as products for neglected diseases of poverty, antibiotics or outbreak-prone pathogens; and the high and rising prices of medicines such as those for cancers or rare diseases 1 . Improved understanding of the biomedical R&D process is essential to address these societal challenges.
Relatedly, the question has arisen as to whether different approaches to organizing, financing or incentivizing R&D, sometimes referred to as "alternative" or "new" business models of R&D, can address some of the shortcomings of the traditional approach. One area where there has been significant experimentation in alternative business models is neglected diseases (ND) that predominantly affect people in low- and middle-income countries (LMICs). It has long been recognized that commercial R&D models did not and would not generate innovative technologies for these diseases because the market incentive is inadequate to do so 2 . In addition to (usually early-stage) research taking place in academic or public institutes, later-stage product development for NDs has received increased funding and attention through the creation of about two dozen product development partnerships (PDPs) to spur R&D into medicines for neglected diseases, such as malaria or sleeping sickness 3 . While there is significant variation in how they operate, a PDP is usually a non-profit organization with a separate and distinct legal identity that enables collaboration to advance the R&D of drugs, vaccines, diagnostics and other health technologies directed at unmet health needs. PDPs are generally funded by public and philanthropic contributions, which allows R&D to focus on health rather than market outcomes. PDPs usually bring together academic, government, industry and philanthropic actors to jointly develop new health technologies. Often, PDPs operate as "system integrators" that coordinate several partners who perform R&D activities 3 .
With at least two decades of significant non-commercial ND R&D efforts behind us, it is an opportune moment to examine more closely how they compare to traditional commercial R&D on costs and efficiency. Several studies of specific non-commercial R&D initiatives have been published 4-9 , but we did not find any recent research examining costs, timelines or attrition rates across more than one initiative (a groundbreaking study was conducted by Moran et al. in 2005 10 , but with a very small dataset given that these organizations were only a few years old at the time).
This study sought to contribute to the knowledge base by gathering evidence on the costs, timelines and attrition rates of non-commercial ND R&D initiatives and analysing how they compare to estimates of commercial R&D. In general, timeline is defined as the time spent to develop a given product, and attrition rate is defined as the proportion of projects that did not pass to the subsequent phase out of the total number of projects that entered a particular phase (a wide range of terms is used in the literature to refer to this concept, including "failure rate" and the converse concepts of success rate, phase transition rate, approval rate, likelihood of approval and probability of success). Furthermore, we define "non-commercial" R&D as that undertaken primarily with a not-for-profit purpose. Often, the lead organizations of such initiatives are academic or governmental in nature, or non-profit PDPs. For-profit firms frequently play a collaborating role by providing access to compound libraries, technical expertise, and/or products for use in testing, among other in-kind contributions. However, these initiatives are not part of the firm's core commercial portfolio or strategy, as they are not expected to generate significant (if any) market returns. Therefore, "non-commercial" should not be interpreted as excluding the private sector. Finally, we prefer the broader term "non-commercial" to "non-profit" because in some cases a developer may earn profit or revenue on a product, even though that is not the main purpose of its activities.
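The definition of attrition rate given above, and its converse (success rate), can be written out as a minimal Python sketch. The function names and the example counts are hypothetical and are not taken from the study data.

```python
def attrition_rate(entered: int, advanced: int) -> float:
    """Share of projects that entered a phase but did not pass to the next one."""
    if entered <= 0:
        raise ValueError("entered must be positive")
    if not 0 <= advanced <= entered:
        raise ValueError("advanced must be between 0 and entered")
    return (entered - advanced) / entered


def success_rate(entered: int, advanced: int) -> float:
    """Converse concept: share of projects that passed to the next phase."""
    return 1.0 - attrition_rate(entered, advanced)


# Hypothetical phase: 10 projects entered, 4 advanced to the next phase.
example_attrition = attrition_rate(10, 4)
example_success = success_rate(10, 4)
print(f"attrition: {example_attrition:.0%}, success: {example_success:.0%}")
```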
The Portfolio-to-Impact (P2I) tool was used as a parameter of comparison in the study. P2I is a modelling tool initially developed by TDR (Special Programme for Research and Training in Tropical Diseases) co-sponsored by the United Nations Children's Fund (UNICEF), the United Nations Development Programme (UNDP), the World Bank and World Health Organization (WHO), to estimate funding needs to accelerate health product development from late stage preclinical study to phase III clinical trials, and to model potential product launches over time. The tool is based on averages of costs, length of phase and probability of success derived from historical data on health product development for all diseases and all product developers. A description of the tool is available elsewhere 11 . This study was undertaken as part of a TDR-led consortium of organizations that conducted further analysis of the P2I model throughout 2019.

Amendments from Version 1
In response to comments received, we made a number of edits to version 1 of the manuscript. In summary:
We added more information about our parameter of comparison and how the averages used in the P2I Model compare to other estimates available in the literature (new Table 1).
We added additional information about neglected diseases and product development partnerships (PDPs).
We added a definition of timeline and attrition rates, and of NCE Simple and NCE complex.
We added additional references, including a groundbreaking study conducted by Moran et al. in 2005 about non-commercial pharmaceutical R&D.
We added a table summarizing our quantitative data on costs and timelines (new Table 5).
We clarified that the data regarding costs refers to direct costs per candidate per phase.
We reviewed the conclusion regarding attrition rates, as the collected data was not suitable for hypothesis generation, and added additional discussion on the implications for use of the P2I Model and analysis of overall efficiency of non-commercial R&D.
We revised the abstract to reflect the changes made in the text.

Methods
The study was designed to collect: 1) quantitative data from non-commercial R&D initiatives on costs, timelines and attrition rates, and 2) qualitative data from non-commercial R&D initiatives and/or experts on such initiatives to explain costs, timelines and attrition rates and reasons why these might or might not differ from commercial R&D. Data was collected between June and September 2019. Participants were given the opportunity to review and comment on a first draft of the research report but were not allowed to withdraw their data (a copy of the consent form is available with the full research report as extended data; Annex 4, pp. 91-94 12 ). All data has been aggregated and anonymized.

Participant selection
The study population was selected using the database of pipeline technologies for neglected diseases developed by Duke University and Policy Cures Research 13 . The database does not include all non-commercial R&D initiatives (for example, it excludes biodefense projects that are largely publicly funded), but it is the most comprehensive database of which we are aware focusing on R&D for neglected diseases, which is by nature largely non-commercial. We used a version of the spreadsheet "Candidates in the pipeline for neglected diseases, as of August 31, 2017" sent to us by the authors, which included a categorization of the organizations by developer type. We selected all not-for-profit organizations that were directly involved in conducting R&D, which included product development partnerships (PDPs), academic and research institutions and public research institutes and other public sector organizations. There were a total of 443 candidate products and 285 organizations that fit this initial criterion. All 16 PDPs were included given their organizational focus on non-commercial R&D, and the list of PDPs was complemented by other studies 3,14 . In addition to PDPs, we included organizations that had at least one product that had reached Phase 3. Another 32 organizations were included after this second selection, and three additional organizations were included through snowball sampling. The final list of organizations is available in the full research report (extended data; Annex 1, pp. 77-78 12 ). We contacted each organization by email, with the initial request addressed to the organization's most senior executive (e.g. Chief Executive Officer, Executive Director, Managing Director) to ensure leadership was aware of and agreed to our interview request, as recommended by the ethical review process. In specific cases, where we had reason to know another employee would be relevant to or aware of our research project, we copied other individuals on the initial email. The senior executive often delegated the interview to one or more staff, such as the lead staff person responsible for R&D, finance, policy and/or external relations.

Quantitative data - collection and analysis
We developed a questionnaire using MS Excel to collect quantitative data from the organizations in our sample on the costs, timelines, and attrition rates for a given organization or project (extended data; Annex 2, pp. 79-89 12 ). We also included quantitative data on costs, timelines or attrition rates published in reports or articles prior to the start of this study from four organizations 6-9 . For timelines, we consulted data available on the organizations' websites and in the clinical trials database ClinicalTrials.gov.
We created a quantitative dataset in Excel combining data provided by respondents with publicly available information pertaining to the costs, timelines, and attrition rates of non-commercial R&D initiatives. Data was anonymized and combined by product archetype. Due to the limitations of our dataset, we limited our analysis to only two P2I archetypes: simple and complex new chemical entities (NCE-Simple, NCE-Complex), defined as those with or without validated target or mechanism of action respectively 15 . Only one organization provided information about diagnostics and two about vaccines (one only included aggregated totals for one product), and we excluded these from the analysis as it would be impossible to protect the anonymity of the organizations.
Several assumptions were made to standardize the data and allow comparison (see detailed methodology in extended data; pp. 31-32 12 ). For costs, we assumed that money was spent at a steady rate across the time period of each phase. For data in currencies other than USD, a yearly exchange rate from the year in which the cost was incurred was used to make the conversion into USD. Totals were calculated in 2017 dollars to facilitate comparison with P2I model figures. Inflation and deflation adjustments used the standard consumer price index.
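The standardization steps described above (steady spending across a phase, conversion at the yearly exchange rate of the year in which the cost was incurred, and adjustment to 2017 dollars using a consumer price index) can be sketched in Python as follows. This is only an illustration under stated assumptions: the exchange rates and CPI values are placeholders and not real data, costs are spread evenly over calendar years, and the names `PhaseCost` and `to_base_year_usd` are hypothetical rather than part of the study's methodology.

```python
from dataclasses import dataclass

# Placeholder values for illustration only; NOT real exchange rates or CPI data.
FX_TO_USD = {("EUR", 2014): 1.30, ("EUR", 2015): 1.10}
CPI = {2014: 95.0, 2015: 96.0, 2017: 100.0}
BASE_YEAR = 2017


@dataclass
class PhaseCost:
    amount: float    # total spent on the phase, in `currency`
    currency: str
    start_year: int  # first calendar year of the phase
    end_year: int    # last calendar year of the phase (inclusive)


def to_base_year_usd(cost: PhaseCost) -> float:
    """Convert a phase cost to BASE_YEAR US dollars."""
    years = range(cost.start_year, cost.end_year + 1)
    per_year = cost.amount / len(years)  # steady spending assumption
    total = 0.0
    for year in years:
        # Yearly exchange rate of the year in which the cost was incurred.
        rate = 1.0 if cost.currency == "USD" else FX_TO_USD[(cost.currency, year)]
        usd_nominal = per_year * rate
        # Inflate (or deflate) to base-year dollars with the price index.
        total += usd_nominal * CPI[BASE_YEAR] / CPI[year]
    return total


# Hypothetical phase: 2.0 million EUR spent evenly over 2014 and 2015.
example = to_base_year_usd(PhaseCost(2.0, "EUR", 2014, 2015))
print(f"{example:.2f} million USD (base year {BASE_YEAR})")
```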
For timelines, we calculated total time spent in development as the sum of time spent in Pre-clinical, Phase 1, 2 and 3. Phase 1a and 1b trials were counted as phase 1. Phase 2a, 2b, and 2c trials were counted as phase 2, as were phase 2/3 tests. Phase 3a and 3b are both counted as phase 3. An average of time per trial for each phase was taken to estimate the amount of time required for a given clinical phase. For some data points, early stage testing was aggregated for multiple drug candidates and total time spent in a phase was divided by the number of candidates. Many candidates have multiple trials in each phase and the average of all trials was used.
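As an illustration of the timeline rules above, the following Python sketch maps sub-phase labels onto the main phases, averages trial durations within each phase, and sums the phases. The trial list is hypothetical, the label spellings are assumptions, and the division of aggregated early-stage time across multiple candidates is not covered.

```python
from statistics import mean

# Sub-phase labels mapped onto main phases, following the rules in the text.
PHASE_MAP = {
    "preclinical": "preclinical",
    "1": "phase 1", "1a": "phase 1", "1b": "phase 1",
    "2": "phase 2", "2a": "phase 2", "2b": "phase 2", "2c": "phase 2",
    "2/3": "phase 2",
    "3": "phase 3", "3a": "phase 3", "3b": "phase 3",
}


def phase_durations(trials):
    """trials: list of (label, years). Returns the mean duration per main phase."""
    grouped = {}
    for label, years in trials:
        grouped.setdefault(PHASE_MAP[label], []).append(years)
    return {phase: mean(values) for phase, values in grouped.items()}


def total_development_time(trials):
    """Sum of the average time spent in each phase."""
    return sum(phase_durations(trials).values())


# Hypothetical candidate with several trials in phases 1 and 2.
trials = [("preclinical", 2.0), ("1a", 1.0), ("1b", 2.0),
          ("2a", 1.5), ("2/3", 2.5), ("3", 3.0)]
durations = phase_durations(trials)
total_years = total_development_time(trials)
print(durations, total_years)
```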
Qualitative data - collection and analysis
A list of questions for semi-structured interviews was developed to collect the qualitative data (extended data; Annex 3, p. 90 12 ). The questions were not pilot tested. Most interviews lasted about one hour and were either conducted in person in Geneva or using videoconferencing software and were recorded upon agreement of the participant. Interviews were held with individuals with a high degree of familiarity about product development from the organizations included in the study. Interviews were held by the three authors (MV: MPH, researcher, female; RK: researcher, male; SM: PhD, director of research, female) and no one else was present besides the participants and the researchers.
Transcription or notes from the interviews were analysed and coded using NVivo 12. Interviews were coded by MV based on themes derived from the data, and reviewed by SM. A description of the coding tree is not available. Interviews were anonymized both at individual and organization level and each was given a number (PO -participant organization) for quotation identification. Given the small sample size, data saturation was not reached.

Parameter of comparison
We used parameters from the Portfolio-to-Impact (P2I) tool v2, which was initially developed by TDR 11 and updated by Duke University and Policy Cures Research 13 . The P2I Model is based on assumptions of costs, timelines and attrition rates. Assumptions on development costs at each phase were based on clinical trial costs from Parexel's R&D cost sourcebook, derived from historical data on health product development of more than 25,000 candidates for all diseases, and further refined and validated by interviews. The underlying data used to construct those assumptions is not publicly available and it was not possible to disaggregate costs, timelines or attrition rates by commercial vs non-commercial developer. Given that non-commercial R&D (in particular, non-commercial late-stage product development) is both relatively recent and small in scale, we assume that the vast majority of the data used to construct the P2I averages comes from commercial R&D. We compared our quantitative data to P2I averages, and our qualitative data compared non-commercial with commercial R&D.
We conducted a literature review on costs, timelines and attrition rates of biomedical R&D to compare other estimates with those of the P2I Model (available at the Knowledge Portal on Innovation and Access to Medicines). There is a wide range of estimates, most focusing on the development of new chemical entities (NCEs) by pharmaceutical companies (commercial R&D). The literature on this topic has advanced considerably over the past decade, with new estimates for costs and risks based on larger, more representative datasets than earlier studies. The estimated average cost to develop a new drug ranges widely, from $43.4 million to $4.2 billion 16 . Similarly, estimates for timelines also vary widely between studies, and according to type of technology and indication (Kimmitt et al. 2020). Estimates available for success rate (that is, the percentage of projects that entered a particular phase of development and passed to the subsequent phase) show that the overall success rate for the clinical stage (from Phase 1 of clinical trials to successfully ending Phase 3) ranges from 6% to 26% for new drugs, with important variation by therapeutic indication and technology type 17 . Differences in methodology, data sources (many confidential), rate of cost capitalization, types of expenses included as R&D (e.g., mergers and acquisitions, expenses related to marketing of the product or tax deductions), and other methodological choices produce a wide range of estimates. P2I timelines are higher for phase 1, lower for phase 2 and within the same range for phase 3 19,22-24 and P2I success rates are lower for complex NCEs and in the same range for simple NCEs 22,23,25-28 . In comparison to the few estimates available for non-commercial R&D initiatives 6,29 , the P2I Model assumptions estimate lower success rates in all clinical development phases, while preclinical is higher. A more detailed literature review is available in the full research report. This study's findings regarding non-commercial R&D should be read in view of the particular estimates for commercial R&D that we used, which are significantly lower than some widely-cited estimates of R&D costs.

Results
We contacted a total of 48 organizations: 23 did not respond, 12 declined and 13 participated in some way, a participation rate of 27% (not all participating organizations provided both quantitative and qualitative data). In total, we obtained quantitative data regarding 8 organizations and 83 products: 37 drug candidates (13 NCEs, 8 repurposed drugs and 16 not specified), as well as 19 vaccine and 27 diagnostic candidates. Qualitative data was obtained from 14 interviews with 20 individuals from 12 organizations; out of those, 18 individuals provided their perspective based on projects conducted within their own organizations and two were experts with knowledge of a range of organizations.
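The sample figures reported above can be cross-checked with a few lines of Python. The counts are taken directly from the text; the variable names are illustrative only.

```python
# Response counts reported in the text.
contacted = 48
no_response, declined, participated = 23, 12, 13
assert no_response + declined + participated == contacted

participation_rate = participated / contacted

# Product counts reported in the text.
drug_candidates = {"NCE": 13, "repurposed": 8, "not specified": 16}
vaccines, diagnostics = 19, 27
total_drugs = sum(drug_candidates.values())
total_products = total_drugs + vaccines + diagnostics

print(f"participation rate: {participation_rate:.0%}")
print(f"drug candidates: {total_drugs}, all products: {total_products}")
```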

Costs
Quantitative data on non-commercial R&D costs were largely in line with the P2I model estimates, with some variation by phase. For simple NCEs, total costs for non-commercial R&D were 13% higher than the P2I estimates (51.87 million USD for non-commercial vs 45.84 million USD for P2I) (Figure 1). The largest differences were in pre-clinical and phase 1, where the costs in our sample were more than double the P2I model estimates. Conversely, phase 2 and 3 trials were less expensive for simple NCEs in our data, but by a small margin. The sample size is too small for statistical significance or to generalize to non-commercial R&D more broadly; rather, the findings merely suggest a hypothesis that costs per candidate per phase to develop simple NCEs are similar between non-commercial R&D initiatives and P2I averages.
For complex NCEs, total costs were similar to the P2I model, at 8% lower (53.98 million USD for non-commercial vs 58.93 million USD in P2I) (Figure 2). In contrast with simple NCEs, for complex NCEs non-commercial preclinical and phase 1 costs were lower than the P2I model. Notably, phase 2 costs were much higher in our dataset (12.65 million USD vs 6.39 million USD in P2I). This could be in part because of the high proportion of phase 2/3 trials in the dataset, as well as the ratio of phase 2b to 2a tests being higher than in the P2I data. Phase 3 costs were substantially lower than the P2I estimates, which may be explained by the fact that many pivotal trials were in phase 2. The opportunity to forgo phase 3 testing would drive up phase 2 costs while lowering phase 3 costs. The proportion of pivotal phase 2 tests may be different between P2I and our dataset. As with simple NCEs, the findings merely suggest a hypothesis that should be tested against a larger dataset: that costs per candidate per phase to develop complex NCEs are similar between non-commercial R&D initiatives and P2I averages.
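The percentage differences quoted for simple and complex NCEs follow from the reported totals, as the short Python sketch below shows. The helper name is illustrative; the figures (in million USD) are those given in the text.

```python
def relative_difference(observed: float, benchmark: float) -> float:
    """Difference of the observed value from the benchmark, as a fraction of the benchmark."""
    return (observed - benchmark) / benchmark


# Totals in million USD: non-commercial sample vs P2I estimate.
simple_total = relative_difference(51.87, 45.84)
complex_total = relative_difference(53.98, 58.93)
# Phase 2 costs for complex NCEs.
complex_phase2 = relative_difference(12.65, 6.39)

print(f"simple NCE total:    {simple_total:+.1%}")
print(f"complex NCE total:   {complex_total:+.1%}")
print(f"complex NCE phase 2: {complex_phase2:+.1%}")
```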
To assess how sensitive our results were to coding by archetypes (i.e. characterizing a product as "simple" or "complex" NCEs), we combined our data from both categories and found that total costs (48.9m USD) lay between P2I's NCE-simple and NCE-complex estimates (45.8m-59.9m USD) (Figure 3). This sensitivity analysis suggests that total non-commercial costs are largely in line with P2I parameters, even if there are some differences in coding of archetypes.
The qualitative data identified 12 factors that drove costs up or down in the different phases of product development within non-commercial R&D initiatives 12 (Table 2). Most responses focused on the clinical stages of development rather than pre-clinical or earlier. Three factors pushed costs upward, and five factors pushed costs downward for non-commercial R&D in comparison with commercial. Four factors were categorized as indeterminate as they would affect both non-commercial and commercial R&D in the same way. The table below presents a summary of the factors influencing costs. A description of the factors and sample quotes are available in the full research report.
There were more factors that would push costs for non-commercial R&D down vis-à-vis commercial models. However, as the qualitative data does not tell us about the magnitude of the effects, no conclusions can be drawn based on these qualitative data alone on whether non-commercial R&D would generally cost the same, less or more than commercial R&D. Rather, these factors provide potential explanations that merit further exploration.

Timelines
The quantitative data showed that for simple NCEs, timelines were roughly similar between non-commercial R&D and P2I averages. Non-commercial R&D had shorter preclinical times (1.65 years vs 2.49 years in P2I), and longer phase 1 times (2.61 vs 1.80 years in P2I). Non-commercial R&D also had much shorter phase 2 times (1.75 vs 3.38 years in P2I), while phase 3 times were slightly higher (3.67 vs 3.18 years in P2I). Overall, our dataset suggested modestly faster timelines for non-commercial simple NCE development, taking 9.67 years vs. 10.85 years in the P2I model (Figure 4).
For complex NCEs, non-commercial pre-clinical testing was much shorter (1.00 vs 2.87 years in P2I), phase 1 testing slightly shorter (1.67 vs 1.93 years in P2I), phase 2 longer (4.25 vs 3.51 years in P2I), and phase 3 longer (4.0 vs 2.8 years in P2I) (Figure 5). Overall, non-commercial development time was roughly equivalent to the P2I average for complex NCEs.
Figure 4. Timelines: comparison between the organizations in the study and the averages in the P2I Model for the archetype "new chemical entities - simple". The comparison is made for different stages of development (preclinical, phase 1, phase 2 and phase 3) and in total. Data shows shorter times for collected data at preclinical, phase 2 and total, and longer times at phase 1 and phase 3. Collected data for preclinical is based on 2 data points (n=2), for phase 1 n=3, for phase 2 n=3 and for phase 3 n=2.
The qualitative data identified 12 factors influencing timelines for non-commercial R&D (Table 3). As with costs, the identified factors were categorized by their potential to push timelines up or down for non-commercial R&D in comparison to commercial R&D. Seven factors were likely to lengthen timelines for non-commercial R&D, no factors were likely to shorten timelines and five factors were categorized as indeterminate.
There were a number of factors that would lengthen timelines for non-commercial R&D vis-à-vis commercial models, or that were indeterminate (Table 3). Notably, in none of the interviews did a respondent argue that non-commercial R&D would move faster. As the qualitative data does not tell us about the magnitude of the effect, no firm conclusions can be drawn on whether non-commercial R&D would take generally the same amount of time or more than commercial R&D. We note that Moran (2005) found that pure public-sector R&D took longer than industry averages, while PDP R&D progressed at a similar pace.
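The per-phase durations reported in this section for simple and complex NCEs can be summed to compare total development time. The Python sketch below uses only the figures quoted in the text; the data layout and function name are illustrative assumptions.

```python
PHASES = ("preclinical", "phase 1", "phase 2", "phase 3")

# Years per phase as reported in the text: (non-commercial sample, P2I average).
SIMPLE = {
    "preclinical": (1.65, 2.49), "phase 1": (2.61, 1.80),
    "phase 2": (1.75, 3.38), "phase 3": (3.67, 3.18),
}
COMPLEX = {
    "preclinical": (1.00, 2.87), "phase 1": (1.67, 1.93),
    "phase 2": (4.25, 3.51), "phase 3": (4.0, 2.8),
}


def totals(by_phase):
    """Return (sample total, P2I total) in years, rounded to two decimals."""
    sample = sum(by_phase[p][0] for p in PHASES)
    p2i = sum(by_phase[p][1] for p in PHASES)
    return round(sample, 2), round(p2i, 2)


simple_totals = totals(SIMPLE)
complex_totals = totals(COMPLEX)
print("simple NCE (sample, P2I):", simple_totals)
print("complex NCE (sample, P2I):", complex_totals)
```

Summing the rounded phase values gives 9.68 years for the simple NCE sample, against the 9.67 years reported, a gap that presumably reflects rounding of the per-phase figures. The complex NCE totals (10.92 vs 11.11 years) are computed here from the reported phase values and are not stated in the text.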

Attrition rates
The quantitative data on attrition rates was the most difficult to obtain, and there did not appear to be a standard methodology nor practice of calculating such rates within participating organizations. As all non-commercial initiatives in our sample had relatively small portfolios (compared to large commercial firms), attrition rates might not be meaningful as just one or two candidate successes or failures could lead to large swings in attrition rates. Some organizations provided quantitative data only for costs and timelines of specific products by phase, and not for their overall portfolio, and did not include any information on attrition rates. We judged that the data we received could not be aggregated across organizations as there was limited information for each type of product, nor was it adequate for hypothesis generation. Further research is needed in this area.
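The point about small portfolios can be made concrete with a short Python sketch: one additional failure moves the attrition rate far more in a portfolio of four candidates than in one of a hundred. The portfolio sizes and function names are hypothetical and chosen only for illustration.

```python
def attrition_rate(entered: int, failed: int) -> float:
    """Share of candidates entering a phase that failed to advance."""
    return failed / entered


def swing_from_one_more_failure(entered: int, failed: int) -> float:
    """Change in the attrition rate if one more candidate fails."""
    return attrition_rate(entered, failed + 1) - attrition_rate(entered, failed)


# Hypothetical small portfolio vs hypothetical large portfolio.
small_swing = swing_from_one_more_failure(entered=4, failed=2)
large_swing = swing_from_one_more_failure(entered=100, failed=50)
print(f"small portfolio: +{small_swing:.0%}, large portfolio: +{large_swing:.0%}")
```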
Interviewees were also asked about the main factors that drive attrition rates up or down in the different phases of product development. This question generated a wide range of responses, and different organizations took quite different approaches to conceptualizing, let alone calculating, attrition rates. It should be noted, however, that most interviewees expressed the view that P2I averages for attrition rates seemed to be in line with their experience conducting project development in their organizations. There was also reasonable disagreement as to whether or when a higher attrition rate is undesirable. Some interviewees argued that it is beneficial for an organization to "fail early and fail fast" (that is, to have a higher attrition rate in pre-clinical or Phase 1). Too low an attrition rate could also suggest that an organization is not taking enough risk, particularly in the earlier and lower-cost phases of R&D.
The qualitative data identified nine factors influencing attrition rates for non-commercial R&D (Table 4). As with costs and timelines, the identified factors were categorized as likely to drive attrition rates higher or lower for non-commercial R&D in comparison to commercial R&D. Three factors were identified as pushing attrition rates higher for non-commercial R&D, one factor as pushing attrition rates lower and five factors were categorized as indeterminate. The table below presents a summary of the factors influencing attrition rates. A longer description of the factors and sample quotes are available in the full research report 12 .
There were more factors that would raise attrition rates for non-commercial R&D vis-à-vis commercial models than would lower them, but most of the factors raised by respondents were indeterminate. As the qualitative data does not tell us about the magnitude of the effect, no conclusions can be drawn on whether non-commercial R&D would be characterized by higher, lower or equivalent attrition rates compared with commercial R&D.

Discussion
The quantitative and qualitative data combined paint a complex, if grainy, picture. Keeping in mind the very small sample of quantitative data, the following hypotheses emerge from the analysis.
Regarding costs, the quantitative data suggest that non-commercial R&D total costs are about the same overall as P2I averages for NCEs. It should be noted that in comparison to other estimates available in the literature for commercial R&D, P2I averages are on the low end of the spectrum. The qualitative data identified many more reasons why non-commercial costs would be lower than commercial R&D, but did not shed light on the magnitude of the effects. The overall emerging hypothesis is that total direct costs per candidate per phase of non-commercial R&D are expected to be equivalent or somewhat lower than commercial. Indirect costs for commercial R&D are expected to be higher due to higher overhead and capitalization costs.
Regarding timelines, the quantitative data suggest that non-commercial R&D timelines would be slightly shorter for simple NCEs and equivalent for complex NCEs in comparison to P2I averages. Yet the qualitative data identified many more reasons why non-commercial timelines would be longer than commercial; the data did not shed light on the magnitude of the effects. The overall emerging hypothesis is that timelines of non-commercial R&D are expected to be equivalent to commercial.
Regarding attrition rates, the quantitative data was not adequate for analysis. The qualitative data uncovered more reasons why attrition rates might be higher in non-commercial R&D, but also provided a number of reasons why they might be lower or there might be no difference. Again, the magnitude of the effects is not quantified. The data on attrition rate is not adequate for hypothesis generation.

Table 4. Factors influencing attrition rates for non-commercial (vs commercial) R&D. The table lists the factors influencing attrition rates for non-commercial research and development (R&D) in relation to commercial R&D. Factors are classified in three columns: "attrition rate higher", "indeterminate" and "attrition rate lower". There are three factors listed for higher attrition rates, five factors as indeterminate and one factor for lower attrition rates.
Attrition rate higher: Limited availability or use of optimization tools; Limited scientific understanding of disease; Reluctance to stop the project
Indeterminate: Type of technology or product; Testing for multiple indications; Wide prevalence or incidence of the disease means broad target population across which a drug must be shown to be effective; Combinations or regimens; Differing non-commercial vs commercial reasons for attrition
Attrition rate lower: Lower pre-existing standard of care means easier to demonstrate benefit of candidate product
If non-commercial R&D is characterized by equivalent or lower costs, and equivalent timelines to commercial R&D, the final expected direct costs and time to develop products resulting from a pipeline of non-commercially developed candidate technologies would be equivalent to those resulting from commercial R&D. However, given the lack of suitable data regarding attrition rates for non-commercial R&D, we cannot assess the number of products estimated to be developed from a portfolio of candidates, and thus the cost for a non-commercial R&D model to develop one successful product including the cost of failures. One organization (DNDi) has estimated their costs, including failures, to develop an NCE at 60-190 million EUR 5 , based on industry averages for attrition rates in anti-infectives 29 . Attrition rate is a key variable required to calculate the magnitude of the cost of failures (and therefore the overall costs per successful product of a non-commercial model) rather than just the cost per candidate per phase. However, we were unable to find adequate data for this variable. If attrition rates for non-commercial R&D are higher than P2I estimates, the number of products predicted by the P2I model would be lower, and vice versa.
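To illustrate why attrition rates matter for overall efficiency, the Python sketch below computes an expected cost per successful product, including the cost of failures, from per-phase costs and success rates. It is a simplified illustration and not the P2I model: it assumes each candidate incurs the full cost of every phase it enters, that phases are independent, and that passing the final phase counts as success. All numbers and function names are hypothetical.

```python
from math import prod


def expected_cost_per_success(phase_costs, success_probs):
    """Expected spend per candidate entering the pipeline, divided by the
    probability that a candidate passes every phase."""
    reach = 1.0           # probability of reaching the current phase
    expected_spend = 0.0
    for cost, p in zip(phase_costs, success_probs):
        expected_spend += reach * cost
        reach *= p        # probability of advancing past this phase
    return expected_spend / reach


def expected_products(n_candidates, success_probs):
    """Expected number of successful products from n candidates."""
    return n_candidates * prod(success_probs)


# Hypothetical costs (million USD) for preclinical, phase 1, phase 2, phase 3.
costs = [5.0, 2.0, 6.0, 20.0]
baseline = [0.5, 0.5, 0.5, 0.5]          # hypothetical success rates
higher_attrition = [0.5, 0.5, 0.25, 0.5]  # lower phase 2 success rate

cost_baseline = expected_cost_per_success(costs, baseline)
cost_higher_attrition = expected_cost_per_success(costs, higher_attrition)
products_baseline = expected_products(16, baseline)
products_higher_attrition = expected_products(16, higher_attrition)
print(cost_baseline, cost_higher_attrition)
print(products_baseline, products_higher_attrition)
```

With identical costs per candidate per phase, the scenario with higher attrition yields fewer expected products and a higher cost per success, which is why cost per candidate per phase alone cannot establish overall efficiency.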
This study also identified a number of significant differences between non-commercial and commercial R&D. The many variables that affect cost, timelines and attrition rates also highlight that caution is merited when comparing any single trial, product or organization against average benchmarks, as there are many legitimate reasons for departure from the mean. Therefore, the P2I model may need to be modified when applied more narrowly. While differences may get averaged out when the model is applied to a pipeline of nearly 450 candidates across a broad range of diseases (the intended use of the P2I model), they may be magnified in the narrower context of a single disease, technology type, or organization.
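The contrast between a broad pipeline and a narrow portfolio can be sketched numerically. The snippet below is an illustration only: it treats each candidate as an independent trial with the same underlying failure probability (a simplifying binomial assumption that is not made in the paper), and the function names and the 50% rate are hypothetical. The portfolio size of 450 echoes the pipeline size mentioned above; the size of 4 stands in for a small organization.

```python
import math


def rate_shift_per_failure(portfolio_size: int) -> float:
    """Change in the observed attrition rate caused by one extra failure."""
    if portfolio_size <= 0:
        raise ValueError("portfolio_size must be positive")
    return 1.0 / portfolio_size


def binomial_standard_error(true_rate: float, portfolio_size: int) -> float:
    """Standard error of an observed rate under the binomial assumption."""
    if not 0.0 <= true_rate <= 1.0:
        raise ValueError("true_rate must lie in [0, 1]")
    if portfolio_size <= 0:
        raise ValueError("portfolio_size must be positive")
    return math.sqrt(true_rate * (1.0 - true_rate) / portfolio_size)


for size in (4, 450):
    shift = rate_shift_per_failure(size)
    spread = binomial_standard_error(0.5, size)
    print(f"n={size}: one failure shifts the rate by {shift:.3f}, "
          f"standard error {spread:.3f}")
```

Under these assumptions a single additional failure moves the observed rate by 25 percentage points in a portfolio of four candidates, but by well under one percentage point in a pipeline of 450. Average benchmarks are therefore far less informative for a single organization than for a large pooled pipeline.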
Finally, we re-emphasize that the small size and heterogeneity of the dataset mean that these are tentative conclusions. Further quantitative research is needed to test these hypotheses against larger datasets. Further qualitative research is also needed to deepen our understanding of the strengths and weaknesses of non-commercial R&D initiatives, and how well they function as alternatives to the traditional commercial model, especially beyond neglected diseases where commercial interests are higher.

Conclusions
This was an observational, descriptive and analytic study of non-commercial R&D initiatives. The main limitations of the study were the small non-random sample size and the short period of time in which the study was conducted, which can partially explain the limited amount of quantitative data received. We also recognize that respondents may have had incentives to report costs, timelines or attrition rates that were favourable to their organizations. Although we sought to check quantitative data against publicly available sources, in general very little relevant data was in the public domain or it was only available at a high level of aggregation. As a result, we have sought to be cautious in drawing inferences from the data.
Given the nascent nature of the area, with almost no prior literature focusing on costs, timelines or attrition rates of non-commercial R&D initiatives, we see the merits of this study as generating hypotheses for further testing against a larger sample of quantitative data, and for providing intuition regarding reasons underlying any significant differences between non-commercial and commercial initiatives. The emerging hypothesis is that non-commercial R&D is comparable to commercial initiatives in direct costs per candidate per phase and in timelines. However, the limited information on attrition rates precludes any hypothesis generation regarding overall efficiency.
It is also important to highlight that many non-commercial R&D initiatives arose because the commercial model did not meet important global public health needs. This study did not compare the patient, population-level, equity or health system benefits offered by the products emerging from non-commercial vs commercial initiatives, only the costs, timelines and attrition rates to develop those products. A fuller comparison could take both into account.
For future research, it may be useful both to expand the dataset on NCEs and also dedicate special attention to improving our understanding of non-commercial vaccine and diagnostics R&D, recalling that we excluded vaccines and diagnostics from our quantitative analysis due to very limited data, and were only able to examine a small sample for simple and complex NCEs.
Finally, in future research it would be useful to interview a broader range of stakeholders. Our interviews focused on practitioners with direct knowledge of non-commercial R&D initiatives involved in product development for neglected diseases, usually employees of the initiatives themselves. A more thorough picture is likely to emerge through interviews with additional non-commercial initiatives, and a broader range of their partners and funders.

Lia Hasenclever
Instituto de Economia, Universidade Federal do Rio de Janeiro, Rio de Janeiro, Brazil
The article is of interest because there is a lot of controversy about R&D costs, and it brings new evidence. However, it does contain some inaccuracies. The most serious of these is the use of the term non-commercial. The use of this term is unusual, the most correct would be to use non-profit institutions. Furthermore, what the author calls offset costs, in fact, are indirect costs and not profits. I suggest using the conventional term. It would be interesting to add a definition of what the author understands for each of the adopted comparison parameters: costs, timeframes and attrition rates (p….). This would make reading easier and reach a wider audience.

If applicable, is the statistical analysis and its interpretation appropriate? Not applicable
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Yes
Competing Interests: No competing interests were disclosed.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard, however I have significant reservations, as outlined above.
Author Response 06 Aug 2021

Marcela Vieira, Graduate Institute of International and Development Studies, Geneva, Switzerland
We thank you for your valuable comments and suggestions. We have amended the text to incorporate the changes and provide a point-by-point response to your comments below.
1) The article is of interest because there is a lot of controversy about R&D costs, and it brings new evidence. However, it does contain some inaccuracies. The most serious of these is the use of the term non-commercial. The use of this term is unusual, the most correct would be to use non-profit institutions. Furthermore, what the author calls offset costs, in fact, are indirect costs and not profits. I suggest using the conventional term.
Response: We thank you for your suggestion. However, as explained in the introduction, we prefer to use the term non-commercial, instead of non-profit, as some organizations might earn profit, but that is not their main purpose for conducting product development. The term "profit" is used as the difference between the amount earned in revenues and the amount spent to bring a product to market. We removed the part on "offset costs" and rephrased the sentence for clarification.
2) It would be interesting to add a definition of what the author understands for each of the adopted comparison parameters: costs, timeframes and attrition rates (p….). This would make reading easier and reach a wider audience.
Below are some comments to improve the discussion and conclusions:
The study aims to generate hypotheses about the main differences between noncommercial and commercial initiatives. The overall emerging hypothesis is that timeframes of non-commercial R&D are expected to be equivalent or somewhat longer than commercial, but the authors do not comment any other factors that could impact this hypothesis, in order to conduct further studies, i.e.:
-The size of the portfolio: a very reduced noncommercial portfolio that could become harder efforts to conduct the clinical studies in opposite with the commercial initiatives where the extended portfolio may leverage the efficiency.
-The fulfillment of regulatory requirements: the commercial initiatives usually have an inhouse or outsourcing regulatory and market access department to fulfill (and anticipate) any regulatory requirement. Is it the same in noncommercial initiatives? Could these differences impact the total cost and the frame time of R&D?
There is not any critical appraisal of the used P2I model. Is it comprehensive at all? Is there any relevant factor that wasn't included or missed? How much of the missed information is intrinsic related to the model complexity?
Is it possible to become the model easier to fulfill?
-I would not be worried about the size of the sample, more deeply qualitative data coming from experts and staffers could be rich on bring some other key factors to the table in order to understand the differences and similarities of noncommercial and commercial initiatives.
Having that data, some researchers could rethink the models as P2I to make it easier to fulfill. I mean, to try to do the reverse investigation beginning at the end and go to the beginning.

If applicable, is the statistical analysis and its interpretation appropriate? Not applicable
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Partly
Competing Interests: No competing interests were disclosed.
Reviewer Expertise: Pharmacoepidemiology, regulatory issues and access-to-medicines public policy-making processes.

I confirm that I have read this submission and believe that I have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.
Author Response 05 Aug 2021

Marcela Vieira, Graduate Institute of International and Development Studies, Geneva, Switzerland
We thank you for your valuable comments and suggestions. We have amended the text to incorporate the changes and provide a point-by-point response to your comments below.
1) The study aims to generate hypotheses about the main differences between noncommercial and commercial initiatives. The overall emerging hypothesis is that timeframes of non-commercial R&D are expected to be equivalent or somewhat longer than commercial, but the authors do not comment any other factors that could impact this hypothesis, in order to conduct further studies, i.e.:
-The size of the portfolio: a very reduced noncommercial portfolio that could become harder efforts to conduct the clinical studies in opposite with the commercial initiatives where the extended portfolio may leverage the efficiency.
-The fulfillment of regulatory requirements: the commercial initiatives usually have an inhouse or outsourcing regulatory and market access department to fulfill (and anticipate) any regulatory requirement. Is it the same in noncommercial initiatives? Could these differences impact the total cost and the frame time of R&D?
Response: Thank you for your comments. The full research report (included as underlying data) provides more information about each of the factors mentioned in the interviews, including portfolio size, size of the organization and regulatory expertise. The full research report also includes selected quotes to better illustrate each point and how they might affect costs, timeframes and attrition rates. However, we chose to exclude these more detailed discussions from the article-length version of the study, in order to report the results in a more concise manner.
2) There is not any critical appraisal of the used P2I model. Is it comprehensive at all? Is there any relevant factor that wasn't included or missed? How much of the missed information is intrinsic related to the model complexity? Is it possible to become the model easier to fulfill? I would not be worried about the size of the sample, more deeply qualitative data coming from experts and staffers could be rich on bring some other key factors to the table in order to understand the differences and similarities of noncommercial and commercial initiatives. Having that data, some researchers could rethink the models as P2I to make it easier to fulfill. I mean, to try to do the reverse investigation beginning at the end and go to the beginning.
Response: Thank you for raising this discussion. The literature review and the quantitative and qualitative data collected under this study indicate that the underlying assumptions in the P2I Model for costs and timelines are in line with other estimates available in the literature and roughly similar to the data collected for non-commercial R&D. We believe that the qualitative data (reported in more detail in the full research report) does provide useful additional context regarding how the P2I model may be applied and interpreted, and would refer readers there. The data collected regarding attrition rates was not suitable for analysis and therefore the application of the full P2I model could not be validated. As discussed in the paper, the P2I Model may need to be modified when applied more narrowly to smaller and more specialized portfolios in the context of a single disease, technology type, or organization. Also, as mentioned in the introduction, our study was conducted as part of a TDR-led consortium of organizations that conducted further analysis of the P2I model throughout 2019. The other studies applied the P2I Model to their specific organizations and can be helpful for this discussion. That said, we believe that the Model is a useful tool and fairly easy to use as it requires only that the user inputs information about their average costs, timeframes and attrition rates (which can be modified according to the organization's own figures). However, our study was not directly intended to suggest changes to the P2I Model and we believe that this discussion falls beyond the scope of the paper.

Dear Editor and authors,
Thank you for the opportunity to review this interesting, well-designed, and important study. I suggest accepting with minor changes (treated in F1000Research as 'approved with reservations').
My comments are mostly about interpretation/reporting rather than methodology.
With kind regards, Dzintars
***
Results: 'more reasons' - more than what? Oxford comma before "and attrition" could help readability.
Conclusions: Suggest to start this with the headline finding: quantitatively costs and timeline were similar from survey and the P2I model. Currently drafting states attrition rates were also similar but this does not seem to be reflected in the text.

Introduction:
First sentence should be, "have long been". 'simple' and 'complex' NCEs are not standard concepts; please add a sentence explaining (or just a verbatim quote of how Terry/Yamey et al. defined them: Simple - Validated target or mechanism of action, Complex - Novel target or mechanism of action without understanding of disease pathogenesis). This could otherwise be misunderstood: in other contexts, there are 'complex nonbiologics' (e.g. glatiramer), or, separately, the FDA concept of a 'complex API'.

Results:
In general: It would be helpful to have one more table, to summarize the quantitative findings all in one place (in addition to the useful bar charts that compare individual metrics).
The qualitative data provide an interesting outline of the beliefs held by non-commercial drug developers regarding the comparison of their R&D process to commercial process. However, the comparison of 'number of factors' seems very tenuous. I would suggest to delete, "There were more factors that would push costs for non-commercial R&D down vis-à-vis commercial models." as the authors acknowledge the number of 'factors' is meaningless in the following sentence. The following sentence 'no firm conclusions can be drawn' is also fairly self-evident and so would suggest the whole paragraph is deleted. The same for the analogous paragraphs in the following two sub-headings under the qualitative results (timeframes and attrition rates).

Attrition rates:
"As all non-commercial initiatives in our sample had relatively small portfolios (compared to large commercial firms), attrition rates might not be meaningful.
[…] data we received could not be aggregated across organizations, nor was it adequate for hypothesis generation." Please provide a bit more details on your reasoning for not aggregating and not reporting the responses you received, and why you believe small portfolios would make rates not meaningful. Not intuitively clear to me.

Discussion:
"The overall very tentative hypothesis that emerges is that attrition rates for non-commercial R&D would be equivalent to commercial R&D." I do not see any basis for this statement from the findings of the study, even with the 'very tentative' caveat. You mention PDPs often develop products starting with a candidate compound 'donated' by private industry. Is it possible to sub-set which drugs came from a for-profit compound library and which didn't?
It would be worth touching upon, in the Discussion, why (whether?) it was necessary to keep respondent data anonymised and aggregated. Why do non-profits keep these data secret? Was it apparent to the authors, from preliminary discussions or from past experience, that non-profits would be unwilling to share data unless anonymised and aggregated? If so, interesting to the reader and valuable to future researchers to state this, I think.
The quantitative part of the study effectively ends up being a validation exercise for the P2I model. That is valuable and interesting, but could be discussed a bit more in Background and Discussion, e.g. pointing out that P2I has not been previously externally validated (to my knowledge?) and the implications of the validation provided by this study, i.e. that P2I can indeed be used to plan ND R&D project funding.

Is the work clearly and accurately presented and does it cite the current literature? Yes
Is the study design appropriate and is the work technically sound? Yes
Response: Thank you for the suggestion, we changed the conclusions about attrition rate, and specified that the quantitative data suggested that costs and timelines are largely in line with commercial averages.
Response: We thank you for the suggestions, however we believe that the first article focuses on another topic regarding the quality and benefits of new approved drugs, which is not a topic we discuss in our paper. We added the second reference in the introduction.
6) For the general reader, one or two more sentences explaining what NDs are would be useful, and that there has historically been next to zero private sector investment due to no prospect of return on investment. The same for the concept of PDPs.
Response: We agree and added one sentence further explaining neglected diseases and another on PDPs.
7) I would suggest the term 'development time' could be clearer than the term 'timeframe'.
Response: We thank you for the suggestion, and changed "timeframe" to "timeline" throughout the paper.
[…] Table 5).
11) The qualitative data provide an interesting outline of the beliefs held by non-commercial drug developers regarding the comparison of their R&D process to commercial process. However, the comparison of 'number of factors' seems very tenuous. I would suggest to delete, "There were more factors that would push costs for non-commercial R&D down vis-à-vis commercial models." as the authors acknowledge the number of 'factors' is meaningless in the following sentence. The following sentence 'no firm conclusions can be drawn' is also fairly self-evident and so would suggest the whole paragraph is deleted. The same for the analogous paragraphs in the following two sub-headings under the qualitative results (timeframes and attrition rates).
Response: We thank you for the valuable comment. However, as we are trying to develop intuition for hypothesis generation, we decided to keep the sentence stating the number of factors that were identified in one direction or the other, with the recognition that the magnitude of the effect of each factor is not known (as already stated). We believe that counting the number of factors that might differentiate commercial and non-commercial R&D is relevant for the discussion in the paper, and that we have accurately stated the limitations. We have, however, deleted 'firm' before conclusions, to state more forthrightly that no conclusions can be drawn.
Response: Reworded.
13) Attrition rates: "As all non-commercial initiatives in our sample had relatively small portfolios (compared to large commercial firms), attrition rates might not be meaningful.
[…] data we received could not be aggregated across organizations, nor was it adequate for hypothesis generation." Please provide a bit more details on your reasoning for not aggregating and not reporting the responses you received, and why you believe small portfolios would make rates not meaningful. Not intuitively clear to me.
Response: We changed the text to better clarify why the data could not be aggregated (it was not reported by all organizations, and it was not calculated consistently across those that did) and better explain the small portfolio effect (one or two failures lead to large swings in failure rates).
Discussion:
14) "The overall very tentative hypothesis that emerges is that attrition rates for non-commercial R&D would be equivalent to commercial R&D." I do not see any basis for this statement from the findings of the study, even with the 'very tentative' caveat.
Response: We modified the discussion and conclusion regarding attrition rates to state that the data does not allow for hypothesis generation.
15) Development costs are hugely below what is reported by private industry. Why? Also interesting to note that (my impression without looking up precise figures) development times are fairly similar to what industry reports, while costs are much lower. That makes intuitive sense as there is an inherent minimum development time needed for trials to see drug effects etc., while costs can be more labile and may be misreported.
Response: Thank you for your comment. We added more information about the parameter of comparison, further explaining that P2I averages are in line with other estimates for commercial R&D but there are a couple of estimates that are much higher (and added a table illustrating the range of estimates for costs, new Table 1). However, explaining the reasons why these estimates are so much higher is beyond the scope of the paper.
16) Also, development costs seem substantially lower than DNDi report. Why? (Drugs for Neglected Diseases initiative (DNDi) (2019), 15 Years of Needs-Driven Innovation for Access: Key Lessons, Challenges, and Opportunities for the Future, Geneva: DNDi.)
Response: We added the DNDi report in the table illustrating the range of estimates for R&D costs (mentioned above). Explaining the reasons why the estimates differ among each other is beyond the scope of the paper. However, it is only preclinical costs in P2I that are lower than DNDi figures. DNDi costs for Phase I are in between P2I NCE Simple and Complex, and DNDi figures for Phase II and III combined are lower than P2I averages. We also flag in the conclusions the DNDi estimate for cost to bring one successful product to market, including cost of failures, but this is a different figure from the per candidate/per phase numbers that P2I uses and that we report in the new Table 1.
17) You mention PDPs often develop products starting with a candidate compound 'donated' by private industry. Is it possible to sub-set which drugs came from a for-profit compound library and which didn't?
Response: The quantitative data received does not allow for this disaggregation and the information about the origin of the compound was not provided.
18) It would be worth touching upon, in the Discussion, why (whether?) it was necessary to keep respondent data anonymised and aggregated. Why do non-profits keep these data secret? Was it apparent to the authors, from preliminary discussions or from past experience, that non-profits would be unwilling to share data unless anonymised and aggregated? If so, interesting to the reader and valuable to future researchers to state this, I think.
Response: As stated under "data availability", the data was aggregated and anonymised to protect participant confidentiality as required by the Ethics Review Committee given the small number of organizations active in the field. Participant organizations were not asked if they would be willing to share their data openly, and we believe that this discussion falls beyond the scope of the paper.
19) The quantitative part of the study effectively ends up being a validation exercise for the P2I model. That is valuable and interesting, but could be discussed a bit more in Background and Discussion, e.g. pointing out that P2I has not been previously externally validated (to my knowledge?) and the implications of the validation provided by this study, i.e. that P2I can indeed be used to plan ND R&D project funding.
Response: Upon reflection, we have revised the paper to emphasize that we cannot draw any conclusions regarding attrition rates. Given the lack of data suitable for analysis regarding attrition rates, we have also amended our conclusions; our data does validate P2I estimates regarding costs and timelines, but does not validate the full P2I Model in terms of predictions for a portfolio of non-commercial R&D projects given the absence of attrition rate data. We added this in the discussion and conclusion sections. As we already discussed in the paper, the P2I model may need to be modified when applied more narrowly to smaller and more specialized portfolios in the context of a single disease, technology type, or organization. Also, as mentioned in the introduction, our study was conducted as part of a TDR-led consortium of organizations that conducted further analysis of the P2I model throughout 2019. The other studies applied the P2I Model to their specific organizations and can be helpful for this discussion.
The paper seems to compare non-commercial (PDPs/academia) research vs contracted research organizations (CROs). This is therefore not a comparison of non-commercial vs pharma research. Pharma research adds substantial complexities in terms of prices, timelines and attrition which are not analyzed in the paper. However, the paper does not make this point sufficiently clear. Also, CROs are contracted by PDPs (as well as by industry), so their performance on costs/attrition/timeframes is somewhat determined by these partnership relationships and cannot be considered in isolation.

2.
The data set used for the paper is preliminary and limited in size, as acknowledged by the authors. As the study identifies the need for future more in-depth research, the authors may wish to connect to ongoing analysis carried out by PDPs for stronger data and evidence building. For example, a more granular analysis of the benchmarks on costs, timelines and attrition rates could usefully be added, including looking at the comparative advantage of PDPs specifically in terms of lower personnel costs, support in capacity building for research/manufacturing in the south, portfolio approach mitigating the effects of attrition.

3.
Competing Interests: I am the advocacy director in a product-development partnership.