Exploring the effects of temporal evolution in open source software projects
de Waard, Sven (2023-07-31)
This publication is subject to copyright. The work may be read and printed for personal use. Use for commercial purposes is prohibited.
Open access
The permanent address of this publication is:
https://urn.fi/URN:NBN:fi-fe2023081195225
Abstract
This study addresses the governance and coordination challenges faced in open source software (OSS) development projects, a modern development approach that has seen rapid adoption in recent years. Unlike traditional software development, OSS projects involve a geographically dispersed base of volunteers working on the code openly, usually without formal hierarchies or contracts. As a result, OSS projects can face scalability issues, such as developers freely abandoning projects or disputes leading to project forking. OSS governance and its underlying mechanisms have attracted recent research interest, and some of this research proposes that social dynamics and community structures may change as a project evolves and grows. This study therefore attempts to understand how different types of OSS projects evolve over time. Its goal is to identify evolutionary patterns in the community's approach to the trade-off between innovation and sustainability. The research question is as follows: What evolutionary patterns can be identified with regard to the community and its approach to innovation and sustainability of open source software development projects?
To answer this question, a quantitative study was carried out using data from over 1,500 open source software projects created by Google, Microsoft, or Apache. The data was retrieved through the GraphQL API of GitHub, the world's largest open source development platform, and analyzed with Python scripts to identify recurring patterns in the evolution of OSS projects.
Based on the academic literature, a framework is developed to categorize OSS projects, emphasizing the trade-off between innovation and sustainability. The results indicate that after the first project release, attention to smaller features increases. Innovation and sustainability levels, however, are not affected. Over time, projects with high innovation levels tend to transition towards a more defensive approach. Simultaneously, projects with initially low sustainability levels improve and reach sustainable levels after the first year. Notably, the size of the community is a key predictor of new contributor inflow, while having more pull request reviewers proves effective for both contributor retention and innovation. Interestingly, contributors tend to remain engaged for longer periods in non-profit sponsored projects than in for-profit sponsored projects. Moreover, a change in platform ownership does not significantly impact other organizations within the platform. Lastly, the study reveals that early contributors show longer retention, whereas the inflow of new contributors gradually decreases as OSS projects age.
These evolutionary patterns show that, while there are inherent differences between open source software projects, they commonly pass through the same or similar events. The degree of change depends on organizational and project characteristics. As this research focuses solely on sponsored open source projects, further research could examine community-founded projects and compare them with the results of this study. Additionally, investigating the relationship between contributor turnover, innovation, and project sustainability would be valuable.
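As an illustration of the data retrieval mentioned in the abstract, the following is a minimal sketch of how repositories of one organization might be paged through with GitHub's GraphQL API from Python. It is not the author's script: the selected fields (`name`, `createdAt`), the page size, and the helper names `build_payload`, `extract_page`, and `fetch_all_repos` are assumptions made for this example, and the abstract does not state which fields the study actually requested.

```python
import json
import urllib.request

GITHUB_GRAPHQL_URL = "https://api.github.com/graphql"

# Hypothetical query: the fields requested in the study are not stated,
# so only the repository name and creation date are selected here.
REPO_QUERY = """
query($org: String!, $first: Int!, $after: String) {
  organization(login: $org) {
    repositories(first: $first, after: $after) {
      pageInfo { hasNextPage endCursor }
      nodes { name createdAt }
    }
  }
}
"""


def build_payload(org, page_size=50, cursor=None):
    """Build the JSON body for one page of repositories of an organization."""
    return {
        "query": REPO_QUERY,
        "variables": {"org": org, "first": page_size, "after": cursor},
    }


def extract_page(response):
    """Return (repository nodes, next cursor); the cursor is None on the last page."""
    repos = response["data"]["organization"]["repositories"]
    page_info = repos["pageInfo"]
    next_cursor = page_info["endCursor"] if page_info["hasNextPage"] else None
    return repos["nodes"], next_cursor


def fetch_all_repos(org, token):
    """Follow the pagination cursor until every repository page has been read."""
    repos, cursor = [], None
    while True:
        body = json.dumps(build_payload(org, cursor=cursor)).encode("utf-8")
        request = urllib.request.Request(
            GITHUB_GRAPHQL_URL,
            data=body,
            headers={
                "Authorization": f"bearer {token}",
                "Content-Type": "application/json",
            },
        )
        with urllib.request.urlopen(request) as reply:
            nodes, cursor = extract_page(json.load(reply))
        repos.extend(nodes)
        if cursor is None:
            return repos
```

Calling `fetch_all_repos("apache", token)` with a personal access token would return a list of dictionaries, one per repository. Keeping the payload construction and the response parsing in separate functions lets both be checked without network access.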