[Application] Measure the health of the Radicle Community

katerinabc · November 1, 2022, 11:26pm

Project Name: Real-time Community Health Analytics
Team Name: Community Health
Payment Address: 0xE372cf77187E27B15805803Cf331f6b7330F223b
Level: -Seed

Project Overview

Overview

Community Health is a research project to further develop and test a framework for Community Health, create a PoC data collection tool, implement the tool in the Radicle Community and collect interaction, collaboration, and pulse-survey data to validate the framework and provide insights to Radicle.

We’ve been advancing a research project to:

develop a framework for Community Health with actionable metrics
create an open source data collection tool
implement the tool in Radicle
analyse and triangulate members’ interactions with their perceptions of ownership in the community, and their contribution to Radicle product to validate the Community Health framework and provide insights to advance the Radicle community.

An indication of how your project relates to / integrates into Radicle.

Community Health is providing community managers with key health indicators for their community grounded in scientific research. We go beyond engagement to truly understand community members’ perception of their community. This data will complement Radicle’s own health check by focusing on the “soft side”: relationships between people. In addition, it will help with the transition to the DAO by providing baseline metrics about how members feel about Radicle and how they interact with each other.

Project Description

Problem Space

Current community dashboards (e.g., Orbit, Commsor, Blazer) put a strong focus on members’ posting behavior and event or meeting attendance. Community members are treated in isolation from each other, ignoring that humans are social beings and that we thrive thanks to the interactions we have with others. Hence, community metrics based solely on posting behavior ignore that the building block of a healthy community isn’t just posting messages but interaction between people.

Problem Solution

We are creating community health checks that are based on the relationships between people and their sense of belonging and ownership with the community. We do this by creating a science-based community health framework. This framework rests on years of research on communities and social network research.

We are using two data sources for our health check: Discord data for measuring the relationships and computing network metrics, and tiny pulse data to measure members’ perceptions. Together these two data sources provide us with insights into members’ outer world (their interaction with others) and inner world (their feeling towards the community).

Current stage: MVP
We have completed our conceptual framework and are delivering in-depth reports for clients while the dashboard is being developed. We have completed a first round of user tests for our prototype design.

Deliverables

Please list the deliverables of the project in as much detail as possible. Please also estimate the amount of work required and try to divide the project into meaningful milestones.

Total Estimated Duration: 16 weeks
Total Costs: $30,000 in USDC (or equivalent)

Milestone 1 - Data collection tool development

Estimated Duration: 10 weeks
Costs: $20,000

| Number | Deliverable | Specification |
| --- | --- | --- |
| 1. | Discord Pulse bot | Develop a well-being survey members answer in Discord. User is in control of their data and with whom it will be shared. |
| 2. | Discord data extraction | Refine how data is extracted (e.g., speed, number of channels) |
| 3. | Dashboard development | Implement the prototype to create a working interactive dashboard. |

Milestone 2: Data Analysis, Insights Report

Estimated Duration: 4 weeks
Costs: $8,000

| Number | Deliverable | Specification |
| --- | --- | --- |
| 1. | Collect SNA data | Collect data from the agreed channels |
| 2. | Run well-being survey | Implement the wellbeing survey in Radicle. We have 2 default questions and offer a set of other wellbeing questions that can be asked depending on the needs of the community. |
| 3. | Data analysis | Clean and analyze the data. We have a default script for data analysis which is aligned with our conceptual framework. However, to have more context and make better sense of the data, we would like to know what is happening in the community. |
| 4. | Sharing results | |

Milestone 3: Community health workshop

Estimated Duration: 2 weeks
Costs: $2,000

| Number | Deliverable | Specification |
| --- | --- | --- |
| 1. | Workshop | We offer early customers tailored workshops, open to their community, to discuss the results in detail and co-create an action plan/recommendations. |

Future Plans

Our plan is to have a fully functional dashboard delivering actionable insights for community managers and those responsible for the future of DAOs and members’ well-being. We have applied for several small grants to support the development of our dashboard.
The long-term future of the project will be sustained through a combination of client revenue and (academic) research grants.
We plan to collaborate with academic institutions to ensure further development of health metrics. This will first happen through the team’s network with academic institutions in Europe and the US.
We plan to include other data (e.g., Dework, Radicle) to provide more detailed information about community members’ communication and coordination patterns.

Team

Team members

Katerina (team leader; data science)
Danielo - Product
DenisaFox - Design
Thegadget.eth - development

Contact

Contact Name: Katerina
Discord: katerinabc#6667
Website: rndao.info

Legal Structure

Registered Address: na
Registered Legal Entity: na

Team’s experience

Our team has worked together for the past 6 months developing the proof of concept and building a basic MVP for data collection and analysis. We have delivered a first health check for Aragon and are working on two more this month.

katerina: has a Ph.D. using social network analysis. Since 2016 she has been co-instructing a graduate course on data analytics for HR at Northwestern University. She has also co-organized the Learning in Networks sessions at the International Conference of Social Network Analysis (2018 - 2020), and previously advised a people analytics company on social network metrics.
Twitter: twitter.com/katerinabohlec
Linkedin: linkedin.com/in/katerinab
Github: https://github.com/katerinabc/

Daniel: Previously Head of Governance at Aragon, 8 years of experience in Organization Design consulting (clients include Google, BCG, Daimler, The UN, and multiple startups), and visiting lecturer at Oxford University.
Twitter: https://twitter.com/_Daniel_Ospina
LinkedIn: https://www.linkedin.com/in/conductal/

DenisaFox: has more than 5 years of experience in UX.
Figma: https://cutt.ly/DenisaBrichtova
LinkedIn: https://www.linkedin.com/in/denisabrichtova

thegadget: Software Engineer. Previously, Product Manager at Neolyze (Business Intelligence Dashboard for Instagram)
Github: https://github.com/thegadget-eth/
Twitter: https://twitter.com/mr_gadget22

Team Code Repos

Additional Information

We have published a conceptual framework describing our view on what a healthy community is.
Early development of this project has been funded by Aragon, Polygon DAO, Aave, MetaCartel and Near.
We have delivered a community health check for Aragon (only report; still planning the workshop) and are preparing one for 2 other DAOs.

abbey · November 4, 2022, 6:07pm

Hey @katerinabc! Welcome!

Thanks for the proposal. I had a couple clarifying questions I wanted to discuss:

When we say “community” what stakeholders do we actually mean here? Our contributors? Our token holders? Our general audience?

Are there any case studies we can read regarding the details of the framework and how they have been applied to other communities?

What type of data is being extracted?

Who will this survey be targeted at?

Thanks! Looking forward to hearing from you.
Abbey

katerinabc · November 4, 2022, 8:31pm

Thanks @abbey for your questions. I’m going to answer them in a different order as some of the answers build on each other.

At this moment, we are extracting the following data from discord:

who posted a message
who replied to a message
who reacted to a message using an emoji

We are not extracting what is being said. Our data extraction method is currently not efficient. This is what I meant by “refine”. I realize it isn’t a good word choice. “Improve” would have been better. ~~I’ll edit the proposal by replacing “refine” with “improve”~~ (can’t edit anymore)
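To make the extraction above concrete, here is a minimal sketch of how reply and reaction metadata could be turned into a weighted interaction list. It assumes a hypothetical `MessageMeta` record; the field names are illustrative and are not the team's actual schema or the Discord API. Message text is never stored.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class MessageMeta:
    """Hypothetical metadata record for one message (no message text)."""
    message_id: str
    author: str
    reply_to: Optional[str] = None  # id of the message being replied to
    reactors: list = field(default_factory=list)  # users who reacted with an emoji


def build_edges(messages):
    """Count directed interactions (source, target) from replies and reactions."""
    author_of = {m.message_id: m.author for m in messages}
    edges = Counter()
    for m in messages:
        # A reply links the replier to the author of the original message.
        target = author_of.get(m.reply_to)
        if target is not None and target != m.author:
            edges[(m.author, target)] += 1
        # A reaction links the reactor to the author of the message.
        for user in m.reactors:
            if user != m.author:
                edges[(user, m.author)] += 1
    return edges


messages = [
    MessageMeta("m1", "alice"),
    MessageMeta("m2", "bob", reply_to="m1"),
    MessageMeta("m3", "carol", reply_to="m1", reactors=["alice"]),
]
edges = build_edges(messages)
```

The resulting counts can serve as edge weights for network metrics; self-replies and self-reactions are dropped in this sketch because they do not describe a relationship between two people.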

This is a fair question we touched upon in our conceptual framework. In practice the community health check is limited to Discord, and there again only to the channels you give us access to. If you would like to include on-chain data (e.g., voting) we could collaborate with DiamondDAO, but I’ll have to discuss it first.

The survey will target “community members”. As I said above, community here is limited to those who are on Discord. Our survey bot requires a Discord role as input. The survey is sent to those with the matching role. This gives you the option to target specific groups of people.
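A minimal sketch of the role-based targeting described above. The `Member` type and the role names are assumptions made for illustration; the bot's real implementation is not shown in this thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Member:
    """Hypothetical member record: a handle plus the roles they hold."""
    handle: str
    roles: frozenset


def survey_recipients(members, target_role):
    """Return the handles of members who hold the role the survey is aimed at."""
    return [m.handle for m in members if target_role in m.roles]


members = [
    Member("alice", frozenset({"contributor", "core"})),
    Member("bob", frozenset({"community"})),
    Member("carol", frozenset({"contributor"})),
]
recipients = survey_recipients(members, "contributor")
```

Running the survey for a different audience then only requires passing a different role name.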

Unfortunately, at this moment we do not have a published case study. We have completed the health check for Aragon, but it is in their hands to publish it. I’m currently doing the health check for MetaGame. We do have the conceptual framework that talks about how we frame our thinking.

I hope I have answered your questions. Please let me know if you have further questions or comments.

matto · November 5, 2022, 8:41pm

I love the idea and motivations behind this and think it’s addressing a crucial and overlooked topic in the DAO space.

Some Qs:

Is the goal to make this a fully automated product that can be rolled out to many organisations?
How confident are you in being able to create accurate and insightful analysis using only Discord data? Is there any existing research that supports this approach, or is this work the research?
Have you secured additional funding from elsewhere for this stage of the project? If not are you planning to?
Are you able to share the prototype designs you mentioned?
Could you share some more information about the engineering side of the project? (lead, contributors, experience, etc)

katerinabc · November 9, 2022, 11:56am

Thank you for your questions @matto and your vote of confidence in our work.

Question 1

Yes that is the goal. We will combine the automated dashboard with extra research services (consulting) and developing partnerships with academic institutions. I have a meeting about this next week.

Question 2

Our approach is to collect Discord data and run a pulse survey. Discord data is “objective behavior data” in the sense that there is no human bias when collecting the data. Normally social network data is collected with surveys. People would answer a question like “how often do you talk with [team member A]”. This is subjective and prone to over- or underestimation. In this way Discord data is better than survey data for measuring actual behavior.

We complement the Discord data with a traditional pulse survey. We use the pulse survey to measure how community members feel about the community. A pulse survey is a tiny survey (5 questions) that is asked frequently. For perceptions we need to get into “people’s heads”, so here a survey is the best option. Other, potentially more accurate methods are just not possible (e.g., physiological measurements).

You asked about the accuracy of the health check with only Discord data. As someone who has conducted social science research for more than 10 years, I can say that perfect accuracy does not exist. Discord data gives us a good proxy. That’s why we also have the pulse survey.

Research supporting this approach:

To summarize, I am confident that we are on the right track building an accurate and insightful community health dashboard.

Question 3

Yes, we have secured funding from Aragon, Polygon DAO, MetaCartel, Near, and Aave.

Question 4

We finished the first round of user testing last week and are incorporating the feedback we received. You can book a call with @danielo to see where we’re at and where we’re heading: https://meetwithwallet.xyz/daniel/community-health

Question 5

thegadget is our product lead (https://github.com/sepehr2githu; thegadget-eth (Mr gadget) · GitHub). He has experience developing BI services, Chrome extensions, and mobile applications using React Native.

Other contributors on the tech team are:

Mehrdad (tech lead; github; linkedin): 5 years of experience as back-end developer and one year as business analyst/project manager (at CBI Global). He co-founded a BI start up providing social media analysis to digital marketing agencies. Through this, he gained a lot of experience in data analysis, ETLs, micro-service architectures, messaging services (RMQ, Kafka), and different databases (MongoDB, Redis, Elasticsearch, Neo4j and Postgres).
Webmiracle (backend; github): Blockchain engineer and full-stack developer with over 8 years of extensive web experience. He worked for weave.financial as a blockchain and frontend developer. Previously he worked at Fantompayxyz.com, on smart contracts and a reflection token, FTMpay on Fantom network and on Web3 integration to interact with backend using Nextjs.
MagicPalm (backend; github): blockchain engineer and a fullstack developer with over 8 years of experience in professional software development.
Nima (frontend; github): front-end developer with more than 3 years of experience in developing applications based on Vue.js and React.js. He developed a video stream platform based on WebRTC and Socket. In his previous company, he contributed to a 50% increase of customer data quality through the development of a module based on Parcel and TypeScript.

In addition to this, we have two team advisors: Waka (github) and Sam (github).

On the data science side, I’m working closely with Tjitse (github): Tjitse has experience with large scale biological data analysis. Currently, he is completing his PhD in dynamical neuroscience where he is using graph theory and advanced statistical methods to study how neurons self organize into interaction networks. He is applying the same techniques to study community member interaction networks (Complexity vs stability, an ecosystem perspective). His work was published in Nature (The code for this research is available here).

katerinabc · November 24, 2022, 11:40am

@abbey and @matto thank you for your interest. Could you please provide an update on the grant application?

matto · November 29, 2022, 3:01pm

Thanks a lot for the detailed reply and sorry for not responding sooner!

This looks like a great project with a great team and I would fully support funding it.

I’m not a part of the grants committee or grants process however, so they’ll need to update on the status of the application.

@bordumb @abbey ?

bordumb · November 29, 2022, 11:06pm

Hello,

Thanks a bunch for this application.

I’ve got one question that is “blocking” (i.e. I won’t be able to vote in favor until resolved) regarding identity.

And some clarifying questions after that.

Hard Data Problems

1. Identity

The proposal mentions starting with Discord data.

How do you imagine we could incorporate future data from other sources?
Namely, how would we tie in user-level data from other sources?

Usually in “user health” type analyses, you need a common join key, such as a userId/userName that is common across all datasets involved.

I imagine we will have Discord usernames, but then say various other identities (Radicle, GitHub, etc.) all with their own usernames. There doesn’t seem to be a way we’d be able to tie the user IDs/user names together. So any insights from one platform are unlikely to join to data from another.

This will limit us to generalized statements like:

These 5 people’s activity on Discord looks inactive/unhealthy
These 7 people’s activity on Radicle looks active/healthy
But we don’t know if the same people who are looking unhealthy on Discord are also unhealthy on Radicle

Note: my point here is that this is a fundamental problem and unless there is a solution, I think moving beyond a simple Discord POC will be difficult. I’d like to see what thinking there is on either (a) how to solve this identity problem entirely or (b) how it can be worked around.

2. Metrics

This framework rests on years of research on communities and social network research…Hence, community metrics solely based on posting behavior ignores that the building block of a healthy community isn’t just posting messages but interaction between people.

2 questions:

1. Can you please cite some of the research you think will help guide this work?
I know a bit about network theory (e.g. using eigenvector/centrality to understand the importance of nodes in networks). It’s a bit hard to tell what direction this work would take without knowing what research it’s founded upon. I’m curious what’s inspiring this work.

2. Can you please provide some metrics you’ve thought about beforehand?
I understand there will likely be ad-hoc analyses done to determine suitable metrics once the data pipeline is built and data pulled. So I’m more just curious to see what signals you imagine will be helpful.

Thanks!

katerinabc · December 7, 2022, 11:47am

Thanks @bordumb for your questions.

Identity

Yes, user-identification is an issue we need to tackle to go beyond Discord and overall system-level health metrics. I’d like to specify that, while we are finalizing our privacy policy, we do not tie our health metrics to specific users (e.g., user xyz is contributing positively/negatively to the community’s health), but keep all metrics at the system-level.

I used to do co-citation analysis of academic literature. There, author disambiguation was also an issue (e.g., are Peter John Smith, Peter J Smith and Peter Smith one person, two people or three?). We solved the problem using a combination of a fuzzy matching algorithm and network graphs (co-author network, author-affiliation network). For some cases a manual check was necessary. Below are three solutions we are exploring:

Solution 1: User-provided linked profiles (automatic)
Other community analytics tools (e.g., Orbit) automatically merge those users for which they have a common identifier. For example, on Github people can link their Twitter profiles, providing a way to merge the person’s Github and Twitter activity.
Therefore, the next tool to integrate in our community health analysis should (a) make strategic sense, and (b) allow us to augment Discord profiles with more information. From a strategic perspective we are either looking at work coordination tools aka “tools where work gets done” (e.g., Dework, Wonderverse, Github, Radicle) or organizational design and governance tools (e.g., Snapshot, Sobol).

Solution 2: author disambiguation
Assuming that community members’ profile names do not vary much across platforms, we can calculate how well names match. We would then provide our users with a list of “duplicate members” (online identities that are linked to the same real person) that can be checked and merged. As this approach works best for longer names, we need to include a way for our users to check the list of “duplicate members” and let them manually merge members where necessary.
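A minimal sketch of the name matching described above, using `difflib.SequenceMatcher` from the Python standard library as a stand-in for whichever fuzzy matching algorithm is eventually chosen. The handles and the 0.8 threshold are made up for illustration, and the output is a list of candidates for manual review rather than an automatic merge.

```python
from difflib import SequenceMatcher
from itertools import product


def similarity(a, b):
    """Case-insensitive similarity ratio between two handles (0.0 to 1.0)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def candidate_duplicates(discord_names, other_names, threshold=0.8):
    """Return (discord, other, score) triples that score at or above the threshold."""
    pairs = []
    for d, o in product(discord_names, other_names):
        score = similarity(d, o)
        if score >= threshold:
            pairs.append((d, o, round(score, 2)))
    # Highest-scoring candidates first, so reviewers see the likeliest matches.
    return sorted(pairs, key=lambda p: -p[2])


# Hypothetical handles on two platforms.
discord_names = ["peter.smith", "jdoe"]
github_names = ["peter-smith", "jane_roe", "j-doe"]
candidates = candidate_duplicates(discord_names, github_names)
```

Short handles are the weak spot: a single differing character changes the score of a four-letter name far more than that of a long one, which is why the manual check matters.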

Solution 3: Users upload common identifiers
Another solution is for our users to provide a list of community members’ discord handles and their matching identities on other tools. This can be implemented in two ways:

Community moderators collect the information and upload it.
We turn the dashboard upside-down and provide a portal for community members. This portal would give them their unique health statistics (e.g., activity level, size of network, reach/influence in the network) and would let them add other profiles (e.g., github, telegram). We would use these user-provided linked identities to merge data from the different tools the community manager wants to integrate.
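Once a mapping between handles exists, whether uploaded by moderators or linked by members themselves, it can act as the join key between platforms. A minimal sketch, with hypothetical handles and activity counts:

```python
def merge_activity(discord_activity, other_activity, identity_map):
    """Combine per-user counts from two tools via a user-provided mapping.

    identity_map maps a Discord handle to the matching handle on the other tool.
    Members without a linked identity get None for the other tool.
    """
    merged = {}
    for handle, count in discord_activity.items():
        other_handle = identity_map.get(handle)
        merged[handle] = {
            "discord": count,
            "other": other_activity.get(other_handle, 0) if other_handle else None,
        }
    return merged


# Hypothetical data: message counts on Discord, commit counts on another tool.
discord_activity = {"alice#1": 12, "bob#2": 3}
github_activity = {"alice-dev": 5}
identity_map = {"alice#1": "alice-dev"}
merged = merge_activity(discord_activity, github_activity, identity_map)
```

Keeping unlinked members as `None` instead of zero makes it visible where the mapping is incomplete, so missing links are not mistaken for inactivity.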

Metrics

The research is based on organizational behavior research, mainly on studies of knowledge sharing patterns in teams and (formal) learning communities. For example, there is a body of research on knowledge sharing in organizations.

Beginning with Granovetter’s (1977) theory of strength of weak ties, you need strong ties (frequent interaction) to share tacit, hard to codify knowledge. Weak ties are good for getting access to unique and novel information. This is further supported by work Hansen has conducted (e.g., The search-transfer problem, 1999).

So weak ties (“infrequent communication”) are good for innovation. But it’s a bit more complicated than that. Research on team assembly of science teams has shown that, especially for cutting edge innovation, when creating a team, team members prefer familiarity (strong ties) to novel information (weak ties) to reduce the risk that team members can’t get along with each other (Lungeanu et al. 2015).

From this body of research, we assumed that communities should have some level of fragmentation. That would give people the space to discuss and work on very specific topics. But a community also needs boundary-spanners. People who bridge silos and make sure the small-groups are staying connected. These boundary-spanners reduce the level of fragmentation.

From research on online education, we know that a person’s position in the network influences their retention (Eckles & Stradley, 2015). A person who is less connected is more likely to leave. Similarly, a person who has many friends who are leaving is also more likely to leave (peer effect).

Finally, our thinking is also influenced by research on ecology. The following article on network biomimicry, written by my team member, describes how findings from ecology apply to communities.

We have started with a broad list of metrics that, based on our research, are indicators of community health. Over time, we have narrowed down this list to focus on metrics that are intuitive and actionable.

In our published framework under section 5 we mention one of our metrics: the small world metric. This metric provides insights into the balance between cliques and bridges a community has. As alluded to in the reply to the previous question, we are also measuring fragmentation. As we do not (yet) provide member profiles, we do not use node-level (individual) social network metrics, like degree centrality or brokerage roles.
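For readers unfamiliar with small-world measures, the sketch below computes the two usual ingredients on a toy network: the average clustering coefficient (how clique-like neighbourhoods are) and the average shortest path length (how well bridges keep everyone reachable). Small-world indices typically compare both against a random graph of the same size; that comparison is omitted here, and the exact formula used in the framework is not given in this thread.

```python
from collections import deque
from itertools import combinations


def average_clustering(graph):
    """Mean share of each node's neighbour pairs that are themselves connected."""
    total = 0.0
    for nbrs in graph.values():
        k = len(nbrs)
        if k < 2:
            continue  # nodes with fewer than two neighbours contribute 0
        links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
        total += 2 * links / (k * (k - 1))
    return total / len(graph)


def average_path_length(graph):
    """Mean shortest-path length over all reachable ordered pairs (BFS per node)."""
    total, pairs = 0, 0
    for source in graph:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for nbr in graph[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


# Toy undirected network: two triangles (cliques) joined by one bridge, c-d.
graph = {
    "a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"},
    "d": {"c", "e", "f"}, "e": {"d", "f"}, "f": {"d", "e"},
}
clustering = average_clustering(graph)
path_length = average_path_length(graph)
```

High clustering combined with short paths is the small-world signature: tight groups that are still only a few steps apart thanks to bridging members.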

Please let me know if you have further questions or comments.

katerinabc · December 21, 2022, 4:43pm

@bordumb and @shelb_ee I understand that things are moving at Radicle (Transition to DAO, testing Otterspace for grant voting) and that you have a lot on your plate.

From my outside perspective, the health check we are currently providing (which is smaller than what I proposed here) will give you two metrics that would be very helpful right now (the health check contains more than these two; I’m just highlighting them):

Social decentralization: Is “informal influence” centralized or not?
Informal influence refers to the influence members have because they talk with someone. While interactions in Discord are open to everyone, the act of talking with someone (addressing their comments, mentioning their name) influences them. This influence can be positive (becoming friends, trusting them) or negative (being on alert for everything they say and do).
Fragmentation: How much interaction is happening between groups of people?
Currently, groups of people are computed using a bottom-up process. If several members often talk with each other, they are a group. But this can be changed, by defining a group as people who have a specific discord tag. This would require a bit of development work.
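To illustrate the two metrics, here is a rough sketch under stated assumptions: fragmentation is approximated as the share of interaction weight that crosses group boundaries (groups given by a hypothetical tag mapping), and social decentralization by Freeman's degree centralization. These are common textbook choices, not necessarily the formulas used in our health check.

```python
def cross_group_share(edges, group_of):
    """Fraction of interaction weight that links members of different groups."""
    total = sum(edges.values())
    across = sum(w for (a, b), w in edges.items() if group_of[a] != group_of[b])
    return across / total if total else 0.0


def degree_centralization(edges):
    """Freeman degree centralization on the undirected, unweighted graph.

    0 means ties are evenly spread; 1 means a single hub (star network).
    """
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)
    n = len(neighbours)
    if n < 3:
        return 0.0
    degrees = [len(v) for v in neighbours.values()]
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))


# Hypothetical interaction counts and group tags.
edges = {
    ("alice", "bob"): 4,
    ("bob", "carol"): 2,
    ("alice", "carol"): 1,
    ("carol", "dave"): 3,
}
group_of = {"alice": "dev", "bob": "dev", "carol": "design", "dave": "design"}
share = cross_group_share(edges, group_of)
centralization = degree_centralization(edges)
```

A low cross-group share points to silos, while a centralization value near 1 suggests that informal influence runs through very few people.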

Let me know what you think about this more narrowly defined health check. Most of milestone 1 would be dropped.

bordumb · December 22, 2022, 9:09pm

Hi

For me, it’s a no.

The reasons are:

This sounds academically interesting, but I see too many practical pitfalls for it to be a killer tool at the moment.
Firstly, we are currently at roughly 40 contributors and maybe a dozen community members who pop in and out sporadically. So it’s not clear that we need any sort of expansive survey of our DAO, at the moment. IMO it would feel a bit forced at this point. It might make sense for us in 6-12 months, but not right now.
Secondly, there seem to be too many gaps that will arise from the fact that we use more than just our Radicle Discord. People talk on DMs, Telegram, people make discussions on Discourse, etc. I understand this Grant would be a small POC, so this is a smaller issue than the first one. But looking at it from a long term perspective, I don’t see the idea for this POC growing into something that provides a holistic measure of health.

In the short term, I’d recommend you shop this around to larger DAOs who may have more of a need to get an expansive survey of Discord activity.

I might be willing to change my mind if someone else in our community has a strong opinion about buying into this.

katerinabc · January 11, 2023, 11:46am

Thanks for your reply. Sorry for the late acknowledgement. Your arguments against it are valid, especially if the community meets in different places. The integration with other platforms would be ready in a couple of months. We would not touch DMs as I consider them “confidential spaces” and doing analytics on them is wrong IMO.

Regarding the team size, 40 is approaching a good size for social network analysis to make sense. We’re doing the analysis on our own team, which is just below half your size. For example, we had a feeling that we are too fragmented, but wanted to get clarity on the size of the problem (how fragmented we are) and how it changes over time (growth/decline in the fragmentation metric).

If you would like a demo, I’m happy to walk you through it.