michelle: voice assistants + wikidata

Hello everyone,

I am seeking conversation and resources about a particular partnership: Amazon Alexa’s and Google Home’s use of Wikidata.

Basically, Wikidata is a publicly licensed, machine-readable knowledge base that Google Home and Alexa use to provide answers to questions.

My approach to presenting/researching this topic isn’t set in stone, but I am interested in theorizing the effects of multi-billion-dollar corporations holding transcripts of millions of knowledge- and information-seeking queries. I am also interested in the obfuscation of the labor behind publicly licensed data, which depends on volunteer labor (the wiki community). Again, I am very interested in conversing with folks about this topic.

I am currently drafting a conference proposal for this symposium: https://www.eventbrite.com/e/the-future-is-open-access-but-how-do-we-get-there-a-symposium-tickets-64613766515

Michelle Nitto


Hi Michelle, this is an interesting topic that I haven’t heard anything about. Can you post the resources that you’re using as a starting point? I might be able to help connect you to people working on this.


Super interesting topic. And it totally makes sense that they are exploiting volunteerism to maximize profit. I’m willing to chat; I’m no expert, but I’m happy to help.

You probably already saw this Wired article, but in case you didn’t: https://www.wired.com/story/inside-the-alexa-friendly-world-of-wikidata/

This event might interest you too: https://docs.google.com/forms/d/e/1FAIpQLSdKQic3Ahhc9PhzUYimUDZQDqIAq8MqTwmGr-aPB3NhCzOrpg/viewform


Wikimedia transparency around Google/Amazon usage of a publicly licensed knowledge database (Wikidata)

Privacy/ethical implications of multi-billion-dollar companies (Google + Amazon) possessing voice transcripts of info/knowledge-seeking queries. How is this partnership feeding into Google’s ambition to know what we want to know before we ask Google a question?

How is this (voice transcripts/data) different from the aggregated PII collected via Google’s platforms?

Theorizing: what types of weird/unethical experiments or research could be conducted using data points found in these transcripts?

Considering the legacy of gendered harassment of domesticized technology or feminized voice assistants, and the unwaged work of gendered domestic labor: how does private companies’ usage (before compensation) of volunteer labor mirror these phenomena?

(working) BIBLIO:

TTW “Hey Theory” Panel (2019)

Edward,. (2015). “Leveraging Wikidata To Gain A Google Knowledge Graph Result.” SearchEngineLand.

Hanika, T., Marx, M. & Stumme, G. (2019). “Discovering Implicational Knowledge in Wikidata.”

Haase, P., et al. “Alexa, Ask Wikidata! Voice interaction with knowledge graphs using Amazon Alexa.”

Lakshmana, R. (2019). “Amazon confirms it retains your Alexa voice recordings indefinitely.” Thenextweb.com

Mozilla. (2019). “Wikidata Gives wings to Open Knowledge.” Mozilla Internet Health Report.

Olson, P. & Roland, D. (2019). “Amazon’s Alexa will provide U.K. Users with Medical Information.” Wall Street Journal.

Perez, Sarah. (2012). “Wikipedia’s Next Big Thing: Wikidata, A Machine-Readable, User-Editable Database Funded By Google, Paul Allen And Others.” TechCrunch.

Simonite, Tom. (2019). “Inside the Alexa-friendly world of Wikidata.” Wired.

Simonite, Tom. (2019). “Who’s Listening When You Talk to Your Google Assistant?” Wired.

Tanon, et al. (2016). “From Freebase to Wikidata: The Great Migration.” IW3C2.

Thanks @alison, I’d really appreciate that. Check out my reply to my own thread for my working questions (I still don’t have a research question) and working bibliography!

Thanks @_TJ. Maybe we can talk virtually or IRL next weekend.
