
Re: Community renewal and project obsolescence



* M. Zhou <lumin@debian.org> [2023-12-27 19:00]:

Thanks for sharing the figure. The data seems correlated with the number of new Debian accounts. See the figure below:

[Figure: number of accounts created each year]

Python code for this figure:

 ```
 # modified from ChatGPT.
 # XXX: members.csv is copy-pasted from https://nm.debian.org/members/
 import pandas as pd
 import matplotlib.pyplot as plt
 df = pd.read_csv('members.csv', sep='\t')
 df = df[df['Since'] != '(unknown)'] # filter out invalid data
 df['Since'] = pd.to_datetime(df['Since'])
 df['Year'] = df['Since'].dt.year
 account_counts = df['Year'].value_counts().sort_index()
 smoothed_counts = account_counts.rolling(window=3).mean()
 plt.figure(figsize=(10, 6))
 plt.bar(account_counts.index, account_counts.values, color='skyblue')
 plt.plot(smoothed_counts.index, smoothed_counts.values, color='orange',
          label='Smoothed (Window=3)')
 plt.xlabel('Year')
 plt.ylabel('Number of Accounts Created')
 plt.title('Number of Accounts Created Each Year')
 plt.legend()
 plt.savefig('nm-year.png')
 ```

Thanks for the code and the figure. Indeed, the trend is confirmed by fitting a linear model count ~ year to the new members list. The coefficient is -1.39 members/year, which is significantly different from zero (F[1,22] = 11.8, p < 0.01). Even when we take out the data from year 2001, which could be interpreted as an outlier, the trend is still significant, with a drop of 0.98 members/year (F[1,21] = 8.48, p < 0.01).
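For readers who want to see what such a fit involves, here is a minimal sketch of an ordinary least-squares fit of count ~ year with its F statistic. It is not the code used for the numbers above: the function name `fit_linear_trend` and the example counts are made up for illustration, and the real input would be the yearly account counts.

```python
def fit_linear_trend(years, counts):
    """Least-squares fit of count ~ year.

    Returns (slope, intercept, F statistic, degrees of freedom).
    """
    n = len(years)
    x_mean = sum(years) / n
    y_mean = sum(counts) / n

    # Sums of squares and cross-products around the means.
    sxx = sum((x - x_mean) ** 2 for x in years)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(years, counts))

    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    # Split the total variation into the part explained by the trend
    # and the residual part.
    ss_total = sum((y - y_mean) ** 2 for y in counts)
    ss_model = slope * sxy
    ss_resid = ss_total - ss_model

    # One model degree of freedom (the slope), n - 2 for the residuals.
    f_stat = ss_model / (ss_resid / (n - 2))
    return slope, intercept, f_stat, (1, n - 2)


# Hypothetical counts, NOT the real Debian numbers.
example_years = [2000, 2001, 2002, 2003, 2004, 2005]
example_counts = [10, 8, 9, 6, 5, 4]
slope, intercept, f_stat, dof = fit_linear_trend(example_years, example_counts)
print(f"slope = {slope:.2f} per year, F[{dof[0]},{dof[1]}] = {f_stat:.2f}")
```

With 24 yearly counts the residual degrees of freedom would be 22, which matches the F[1,22] reported above. The p-value then comes from the F distribution with those degrees of freedom.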

Best,

Rafael Laboissière

P.S.1: The correct way to do the analysis above is to use a generalized linear model, treating the counts as Poisson-distributed (or, perhaps, allowing for overdispersion). I will eventually add this to my code in Git.
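As a rough idea of what this would look like, here is a sketch of a Poisson regression with a log link, fitted by Newton-Raphson in plain Python. It is only an illustration under stated assumptions, not the code in Git: `fit_poisson_trend` is a hypothetical name, the example counts are made up, and the Pearson dispersion is just one simple way to check for overdispersion.

```python
import math


def fit_poisson_trend(years, counts, max_iter=50, tol=1e-10):
    """Fit log(E[count]) = a + b * (year - mean year) by Newton-Raphson.

    Returns (b, standard error of b, Pearson dispersion).
    """
    n = len(years)
    x_mean = sum(years) / n
    xs = [x - x_mean for x in years]  # centring keeps exp() well behaved

    def information(a, b):
        # Fitted means and the entries of the Fisher information matrix.
        mu = [math.exp(a + b * x) for x in xs]
        i_aa = sum(mu)
        i_ab = sum(m * x for m, x in zip(mu, xs))
        i_bb = sum(m * x * x for m, x in zip(mu, xs))
        return mu, i_aa, i_ab, i_bb

    a = math.log(sum(counts) / n)  # start from the intercept-only model
    b = 0.0
    for _ in range(max_iter):
        mu, i_aa, i_ab, i_bb = information(a, b)
        # Score vector (gradient of the log-likelihood).
        g_a = sum(y - m for y, m in zip(counts, mu))
        g_b = sum((y - m) * x for y, m, x in zip(counts, mu, xs))
        # Newton step: solve the 2x2 system information * step = score.
        det = i_aa * i_bb - i_ab * i_ab
        step_a = (i_bb * g_a - i_ab * g_b) / det
        step_b = (i_aa * g_b - i_ab * g_a) / det
        a += step_a
        b += step_b
        if abs(step_a) + abs(step_b) < tol:
            break

    mu, i_aa, i_ab, i_bb = information(a, b)
    det = i_aa * i_bb - i_ab * i_ab
    se_b = math.sqrt(i_aa / det)
    # Pearson chi-square over residual degrees of freedom; values well
    # above 1 suggest overdispersion relative to the Poisson model.
    dispersion = sum((y - m) ** 2 / m for y, m in zip(counts, mu)) / (n - 2)
    return b, se_b, dispersion


# Hypothetical counts, NOT the real Debian numbers.
example_years = list(range(2000, 2010))
example_counts = [30, 27, 29, 24, 25, 21, 22, 19, 18, 17]
b, se_b, dispersion = fit_poisson_trend(example_years, example_counts)
print(f"change per year: {100 * (math.exp(b) - 1):.1f}%, "
      f"SE(b) = {se_b:.3f}, dispersion = {dispersion:.2f}")
```

Note that the slope here is on the log scale, so it describes a relative change per year rather than a fixed number of members per year.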

P.S.2: In your Python code, it is possible to get the data frame directly from the web page, without copying and pasting. Just replace the line:

    df = pd.read_csv('members.csv', sep='\t')

by:

    df = pd.read_html("https://nm.debian.org/members/")[0]

I am wondering whether ChatGPT could have figured this out…

