A Practical Guide to Creating Your Own Stewardship License

We wrote this guide as part of Mozilla's Data Futures Lab.

Preface

It's impractical to write a single license that works for every community and every context. Instead, we offer principles, values, and examples to help people develop their own licenses, each serving the community it is meant to protect.

Open technologies and open data are incredibly important for creating a fair and just society and preventing centralization of power. Unfortunately, many marginalized communities don't have the resources that enable them to benefit from open technologies and open data. Furthermore, the concentration of wealth among a few individuals, corporations, and nations has given the wealthy an unprecedented ability to take advantage of open data and open technologies to further centralize and grow their wealth and power. Marginalized communities should understand the risks of adopting open licenses, and we encourage them to consider creating their own licenses using this document as a guide.

Values & Principles

There should be a clear understanding of the values and principles that represent a community before deciding on or creating a license for that community. As Te Hiku Media are a Māori organization, we can share some of the principles of Māori Data Sovereignty, most of which have universal application.

Māori Data Sovereignty Principles

Te Mana Raraunga, the Māori Data Sovereignty Network, have a helpful and concise document on Māori Data Sovereignty principles. We summarize those principles below and offer examples for each.

RANGATIRATANGA - The Licensee has an inherent right to exercise control over its own data. This is the right to self-determination, or sovereignty in general. An example of implementing this principle is letting users opt in to data sharing rather than requiring them to opt out.

WHAKAPAPA - Any decisions about the potential future use of the Licensed resources by the Licensee will be made in conjunction with the licensor with the key goal of Māori data governance to protect against future harm. The idea here is the recognition of an existing relationship and stake in the sources being licensed. It's also important to understand the "genealogy" of the data. Where did it come from? If you can keep track of that, then you know whom the data or software should benefit. A common English term being used that's similar to whakapapa in this context is provenance.

WHANAUNGATANGA - It is recognised by both Parties that individuals and organizations responsible for the creation, collection, analysis, management, access, security or dissemination of Māori data are accountable to the communities, groups and individuals from whom the data derive. Whanaungatanga is about understanding that the decisions you make as an individual may affect those who are closely associated with you, especially if you come from a marginalized group. For example, by giving your genetic information to Ancestry.com, you are also giving the genetic information of your close relatives to Ancestry.com. Did you seek the permission of your relatives before doing this?

KOTAHITANGA - Both Parties understand the need to build capacity and support the development of a Māori workforce to enable the creation, collection, management, security, governance and application of data. This is about ensuring the goals of the license are maintained. For example, we want to create more high-value jobs for Māori in STEM, so by building our own "artificial intelligence" technologies in te reo Māori, we can ensure we're growing a tech economy that puts Māori first.

MANAAKITANGA - The collection, use and interpretation of any data through the utilization of the Licensed Software shall uphold the mana of Māori communities, groups and individuals.

This is about respecting the communities we're meant to serve.

KAITIAKITANGA | GUARDIANSHIP - Both Parties agree that all data relating to this Agreement shall be stored and transferred in such a way that it enables and reinforces the capacity of Māori to exercise kaitiakitanga over Māori data.

Ethics, tikanga, kawa (protocols) and mātauranga (knowledge) shall underpin the protection, access and use of Māori data.

Kaitiakitanga License

Kaitiaki is a Māori word without a specific English translation, but its meaning is similar to the words guardian, protector, custodian, and steward. It removes the idea of "personal property" and instead focuses on protecting shared resources for the benefit of the communities that have the right to protect and access those resources. Our Kaitiakitanga License was created with the same idea, to ensure we look after the data and technologies that we gather and collect. The license ensures that we use those resources following tikanga (cultural protocols), and that benefits derived from those resources go back to the communities who provided the resources or who are the rightful custodians of those resources.

We didn't sit down one day and decide to create a license. Instead, the license embodies how we operate as an organization. Our organization prioritizes cultural protocols. Our day-to-day operations are guided by those protocols. We make decisions based on the values and principles of Māori. We are held accountable by the haukāinga, our local community, and our community let us know when we have failed them. We have decided to try to capture this way of operating in a "License" that can be used to further benefit our community in the context of the digital economy.

Below we present some real examples of our Kaitiakitanga License to help guide you in creating your own License.

Name

First and foremost, the name Kaitiakitanga should be reserved for the indigenous people of Aotearoa. We recommend you adopt a word that comes from the language of your people and carries a meaning or values similar to Kaitiakitanga. One common English word we're hearing is stewardship, especially when it comes to looking after our environment. A "Stewardship License" may be appropriate for English-speaking audiences. But ultimately you want the name to best represent your community.

Set The Scene

Your license should have a preamble. Tell people what the purpose of your license is and why it is the way it is. Be clear about who the license is meant to serve. Here is an excerpt from the preamble to our license:

While we recognize the importance of open source technology, we're mindful that the majority of tangata whenua and other indigenous peoples may not have access to the resources that enable them to benefit from open source technologies. As tangata whenua, our ability to grow, develop, and innovate has been stymied through colonization. We must protect our ability to grow as tangata whenua. By simply open sourcing our data and knowledge, we further allow ourselves to be colonised digitally in the modern world.

Real World Examples

The license for our pronunciation app Rongo is aimed at protecting Māori data while making Māori data accessible to Māori for future research to help the revitalisation and preservation of te reo Māori. The machine learning model behind Rongo was made possible by more than 2500 people offering their voice data to Te Hiku Media for this very purpose.

Kaitiakitanga of Data Shared from Rongo

  • Te Reo Irirangi o Te Hiku o Te Ika (Te Hiku Media) are the kaitiaki of all data you share with us from the Rongo App.
  • Te Hiku Media may use the data you share with us to improve our services and models including but not limited to our pronunciation model(s) and our speech recognition model(s).
  • We will only use the data you share for purposes that align with the values and tikanga of the Kaitiakitanga License as well as the revitalisation, preservation, and promotion of te reo Māori.
  • We may use the data you share with us for Māori-led research around pronunciation.

Kaitiakitanga of Data Processed

Rongo uses the Papa Reo API, which processes your data to enable features such as speech detection and pronunciation assessment. The Papa Reo API has its own Kaitiakitanga License which is upheld by Rongo. Rongo runs the Papa Reo API on device.

  • All recordings are collected and processed on device. We don’t collect any recordings or information about the recordings unless you explicitly share your recordings with us.
  • If you decide to share your recordings with us, you assign us as the kaitiaki of the data that you share. This includes the original recording as well as the metadata created from the recording when it was processed on your device. This metadata includes the scores that our model assigns to your recordings as well as your progress in the app.
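The opt-in behaviour described in these clauses can be sketched in a few lines of Python. This is only an illustration of the principle, not how Rongo or the Papa Reo API is actually built; the names `Recording`, `Outbox`, and `share` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Recording:
    """A recording processed on the device (all field names are hypothetical)."""
    audio: bytes
    score: float          # score assigned by the on-device model
    shared: bool = False  # opt-in: nothing is shared by default


@dataclass
class Outbox:
    """Stands in for whatever would be sent to the kaitiaki."""
    items: list = field(default_factory=list)


def share(recording: Recording, outbox: Outbox, user_consented: bool) -> bool:
    """Queue a recording and its metadata only when the user explicitly agrees."""
    if not user_consented:
        return False  # the recording never leaves the device
    recording.shared = True
    outbox.items.append({"audio": recording.audio, "score": recording.score})
    return True
```

The design choice to notice is the default: a recording starts as not shared, and only an explicit decision by the user changes that.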

The Whare Kōrero is a media platform where third parties upload their content into a centralized place for Māori and New Zealand audiences to access. Platforms like YouTube require you to give non-exclusive, royalty-free rights to your content to YouTube and its parent company so they may profit from your content and create derived works, which include machine learning models or "AI." The Kaitiakitanga License in this context ensures the people who uploaded the content remain the kaitiaki of that content. Te Hiku Media, the creators and protectors of the Whare Kōrero, have no rights to third-party content uploaded to the platform. But Te Hiku Media do have responsibilities to ensure that content is protected under the License.

Kaitiakitanga of Content

  • The kaitiakitanga of all data in the Whare Kōrero remains with the respective kaitiaki [3rd parties] that contributed that data to the whare [platform].
  • Each kaitiaki may make data available for download as deemed suitable by the tikanga associated with that data.
  • You [users of the Whare Kōrero app] may download data to your device via this app, but the kaitiakitanga remains with the kaitiaki. By downloading the data to your device, unless you have kaitiaki rights to the data (for example, the data may come from your whānau [family], hapu, or iwi [tribe], and you have rights to keep the data for your own use), you agree: a. not to extract the data from the app; and b. only access the data through the app.

How Corporations should License Indigenous Data

Firstly, our position is that corporations shouldn't have access to indigenous data; however, our own people put their data into Facebook, TikTok, and other non-indigenous owned platforms every day because we have no suitable alternatives. In cases where it's inevitable that corporations will be in possession of the data of marginalized groups, we advocate that those corporations pay royalties back to those communities.

A practical example is that a number of non-indigenous companies sell indigenous languages as a service. Duolingo sells Hawaiian language learning as a service. Drops sells te reo Māori language learning as a service. It's expected that these companies profit from selling those services. For one, they need to pay for the operating costs of those services. But any for-profit business must also make a margin of profit from its operations. The challenge here is that both te reo Māori and ʻŌlelo Hawaiʻi were once illegal in their homelands. It was illegal to teach ʻŌlelo Hawaiʻi in public schools in Hawaiʻi until the 1980s. Te reo Māori was only recognised as an official language of New Zealand in the 1980s. The parents and grandparents of living Hawaiians and Māori today were beaten in schools for speaking their indigenous languages. The idea that foreign corporations can profit from these languages is a post-colonial kick in the face.

We believe adopting a license like the Kaitiakitanga License can offer Duolingo, Drops, and even Big Tech companies creating LLMs from the data of our marginalized communities a way forward. In this instance, royalties from profits should be paid back to the communities from which the data was gathered. Rather than paying royalties to shareholders, why not pay royalties to the Hawaiian language communities who are struggling to teach their language to their people? Hawaiians are houseless and living in tents while rich Americans are buying up land in Hawaiʻi, building their second homes, or in some cases their doomsday fortresses.

Here is a template for a Kaitiakitanga License for Commercial Use of Data:

Commercial Use of Data

  • Royalties on profits generated from commercial use of data must be shared back with the communities from which the data were collected.
  • Royalties may only be received by non-profit organization(s) or charitable trust(s) (the Beneficiary). The Beneficiary must be a true representation of the community. The Beneficiary must be structured in a way that the community hold them accountable and the community have real representation on the leadership of the Beneficiary.

For example, a portion of the subscriptions that users pay for Duolingo could go to ʻAha Pūnana Leo if the users access Hawaiian Language on Duolingo. It is ultimately up to the licensor to determine in what way royalties should be paid back to communities.
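To show how small the calculation can be, here is a minimal Python sketch of the subscription example above. The function name, the royalty rate, and the choice to base royalties on the subscriptions of users who accessed the language are all assumptions for illustration; the licensor decides the actual basis and rate.

```python
def royalty_due_cents(subscriptions_cents, accessed_language, rate_percent):
    """Return royalties owed, in cents, for one billing period.

    subscriptions_cents: amount each subscriber paid, in cents.
    accessed_language:   whether that subscriber used the language course.
    rate_percent:        hypothetical royalty rate set by the licensor.
    """
    # Only subscribers who actually accessed the language count toward the pool.
    eligible = sum(
        paid for paid, used in zip(subscriptions_cents, accessed_language) if used
    )
    return eligible * rate_percent // 100  # integer cents, rounded down


# Three subscribers paying 700 cents each, two of whom used the language course,
# at an assumed rate of 10 percent:
print(royalty_due_cents([700, 700, 700], [True, False, True], 10))  # prints 140
```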

Tech companies may argue it's impractical to share profits back to the communities whose data these companies profit from, but that's untrue. If they can build products like ChatGPT and spaceships, then we clearly have the technology to build a just and equitable society.

Non-Commercial Use

This is quite simple, but in the context of Kaitiakitanga, we don't think a completely non-commercial use makes sense. For example, maybe the community from which the data or knowledge was gathered can use the data or derived works of the resources for commercial use, but others may not. You can think of this as "affirmative action" in the context of profiting from marginalized data. Marginalized communities are often at the bottom of the socio-economic ladder. One way to help move Hawaiians out of tents and into houses is to level the playing field and give them economic opportunities. The Kaitiakitanga License could do just that.

One option is to give the Kaitiaki the power to decide who gets to profit:

The licensee may not use the licensed technology or data for commercial purposes without permission from the licensor.

Another option is to give the community the right to profit from the resources, though the challenge is deciding who genuinely represents the community:

Only licensees who are a part of the community from which the data were gathered may use this API for commercial use. All other licensees cannot use this API for commercial use.

Prohibited Uses

Most important is ensuring the data or technologies aren't used against the very people they were meant to serve. Prohibited uses should be anything that goes against the values of the Kaitiakitanga License or against the values of the community your license is aimed at protecting.

After first describing the principles of the License, you can say:

In accordance with the principles above, the Licensee may not use the Licensed Software and related data for the following:

  • Surveillance
  • Tracking
  • Discrimination
  • Persecution
  • Unfairness

Nuances of Data, Software, and APIs

Data, software, and APIs need not necessarily be licensed differently. Our Kaitiakitanga License is adapted to each of these contexts. But you do have to consider the nuances of each context to make sure you don't forget any important considerations.

APIs

When we set out to build our te reo Māori language tools (see https://papareo.io), we were building these for Māori. The reality is that many non-Māori reached out to us to use our API because they are in a privileged position where they can benefit by integrating Māori natural language processing tools into their business applications. Not just anyone can create an account and start using our API. Instead, we guard it and manually onboard anyone who wants to access it. This ensures that we are giving Māori priority access to this technology. When non-Māori approach us, we have to apply the values and aspirations of our Kaitiakitanga License to decide whether a non-Māori group should have the privilege to use the API.

The API itself has a Kaitiakitanga License, which all users must agree to. But part of the role of the kaitiaki of the technology is to ensure the people who use the API align with the License. By contrast, anyone with a credit card can access OpenAI's Whisper, which now sells te reo Māori ASR as a service.

Software

Licensing software is a much more common practice, especially for open-source software. Some feedback we've had from strong advocates of open source is that the Kaitiakitanga License is closed. The moment people need to "seek permission" they automatically turn away. Many indigenous peoples didn't just walk into a forest and cut trees down. Oftentimes there were protocols and rituals associated with taking resources from the public domain. Hawaiians would give prayers and offer gifts to nature before felling a tree which would be used for building voyaging canoes. Many religions and scientists described indigenous rituals as barbaric. But the simple act of seeking permission bestows a level of responsibility and respect on the resource being sought. This is one way to ensure what we do is more respectful and inclusive. Indigenous peoples requiring people to ask permission before using their software makes a lot of sense. If anything, it creates an opportunity to share the story of stewardship.

Compatibility

If you want to ensure broader use of your license, you will need to be "compatible" with common protocols. For example, the Linux Foundation hosts Software Package Data Exchange (SPDX), a standard way to declare which license applies to the software you distribute in a form that is machine readable.
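As a rough illustration, SPDX lets you reference a license that is not on the SPDX License List by using the "LicenseRef-" prefix in an "SPDX-License-Identifier" tag placed in a comment at the top of each source file. The Python sketch below builds such a tag line. The helper name `spdx_header` and the identifier "Stewardship-1.0" are hypothetical, so choose an identifier that suits your own license.

```python
import re

# SPDX allows letters, digits, "." and "-" after the "LicenseRef-" prefix.
_IDSTRING = re.compile(r"[A-Za-z0-9.-]+")


def spdx_header(idstring: str, comment_prefix: str = "#") -> str:
    """Build an SPDX tag line for a license that is not on the SPDX License List."""
    if not _IDSTRING.fullmatch(idstring):
        raise ValueError(f"invalid SPDX idstring: {idstring!r}")
    return f"{comment_prefix} SPDX-License-Identifier: LicenseRef-{idstring}"


print(spdx_header("Stewardship-1.0"))
# prints: # SPDX-License-Identifier: LicenseRef-Stewardship-1.0
```

The full text of the license still needs to ship with the software; the tag only tells tools and readers which license applies.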

Conclusion

We aren't experts on licenses. We haven't studied many of the mainstream permissive or copyleft licenses. But we are practitioners of kaitiakitanga. We work daily in our communities. Regardless of our business "titles," we still have roles and responsibilities in our communities whether it means washing dishes at the marae, helping an elder fix her radio, or live streaming events of cultural significance. We see how kaitiakitanga builds better communities in the regions of Aotearoa. It only makes sense that we use similar values to develop licenses to ensure a more equitable and inclusive digital society.