Natural Language Sensor Fusion

$3.92 crowdfunded from 0 people

75% average score over 1 application evaluation
Researching a method for synthesizing multiple natural language inputs from sensors into a coherent data framework, with minimal meaning loss, for better public opinion analysis and governance insights.

As natural language processing takes on increasing importance in computing, the task of synthesizing the data collected by multiple sensors taking readings on the same phenomenon becomes problematic. It is easy to create a multi-sensor array using sensors that take numerical measurements – such as a chemical sensor array – but when the data collected is encoded in natural language, there is no ready metric available for synthesizing multiple inputs without an attendant loss of meaning.
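
To make the contrast concrete, the sketch below shows the numerical case, where fusion is a one-line weighted average, next to a set of natural-language readings for which no comparable operator exists; the readings and weights are invented for illustration.

```python
import numpy as np

# Numerical sensor fusion is straightforward: e.g., a confidence-weighted average.
readings = np.array([21.3, 21.7, 20.9])   # hypothetical temperature readings, deg C
weights = np.array([0.5, 0.3, 0.2])       # hypothetical per-sensor confidence
fused = np.average(readings, weights=weights)
print(f"Fused numerical reading: {fused:.2f} C")

# The same phenomenon reported in natural language has no ready "average":
reports = [
    "The room feels slightly warm.",
    "Comfortable, maybe a touch above normal temperature.",
    "Warmer than yesterday, though not hot.",
]
# There is no equivalent of np.average() for these sentences; that gap is the problem.
```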

My intuition is that the solution to this problem is embedded in the problem itself: the very same vector embeddings that enable large language models to interact with users in natural language can be understood as providing the framework needed to compare and synthesize natural-language data points with minimal loss of meaning, insofar as they work by projecting natural language into a system of mathematical representations.
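
As a concrete and deliberately naive sketch of this intuition, the snippet below embeds several hypothetical natural-language "sensor readings" with an off-the-shelf sentence-embedding model and fuses them as a normalized centroid; the model choice, the example sentences, and the centroid fusion rule are illustrative assumptions, not part of the proposal.

```python
import numpy as np
from sentence_transformers import SentenceTransformer  # any sentence-embedding model would do

# Hypothetical natural-language "sensor readings" describing the same event.
reports = [
    "Turnout at the meeting was higher than expected.",
    "More residents showed up than usual tonight.",
    "Attendance was strong compared with past sessions.",
]

model = SentenceTransformer("all-MiniLM-L6-v2")
vectors = model.encode(reports, normalize_embeddings=True)

# A naive fusion rule: the normalized centroid of the embedded reports.
centroid = vectors.mean(axis=0)
centroid /= np.linalg.norm(centroid)

# Cosine similarity of each report to the fused vector gives a rough sense of
# how much of each individual reading the fusion preserves.
for text, sim in zip(reports, vectors @ centroid):
    print(f"{sim:.3f}  {text}")
```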

This intuition, however, does not reach the question of mechanism; hence, further research is required. Can an LLM "do the math" required for natural language sensor fusion – and if so, what kind of prompting is required? Can an LLM's "embeddings" be abstracted from the model itself, and used as the basis for a natural language sensor fusion process that does not otherwise involve the "source" LLM? Since some measure of meaning must be lost in the process of making natural language computable, is it possible to identify where and how these losses are likeliest to take place, and to develop ways of mitigating them?
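
One way the second and third questions might be operationalized, sketched under assumptions rather than as a settled method: suppose the embeddings have already been exported from whatever model produced them; fusion and a rough "meaning loss" estimate (here, mean cosine distance from the fused centroid) then reduce to plain linear algebra, with no further involvement of the source LLM.

```python
import numpy as np

def fuse_and_score(vectors: np.ndarray) -> tuple[np.ndarray, float]:
    """Fuse unit-normalized embeddings and estimate how much meaning is discarded.

    The fused signal is the normalized centroid; the loss score is the mean
    cosine distance of the inputs from that centroid (0.0 = nothing lost,
    larger values = more of the original variation is being thrown away).
    """
    centroid = vectors.mean(axis=0)
    centroid /= np.linalg.norm(centroid)
    similarities = vectors @ centroid           # cosine similarity per input
    return centroid, float(np.mean(1.0 - similarities))

# Stand-in for embeddings exported from some model beforehand.
rng = np.random.default_rng(0)
vectors = rng.normal(size=(3, 384))
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)

fused, loss = fuse_and_score(vectors)
print(f"Estimated meaning loss: {loss:.3f}")
```

Where exactly that residual meaning lives, and whether it can be recovered or mitigated, is the kind of question the proposed research would need to answer.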

This work has important implications, particularly for public opinion research, which has grown less reliable as older ways of "taking the temperature of the public" struggle to deliver consistently accurate readings. Good governance requires good data, and good data depends on functional sensors. In some ways, the problem of natural language sensor fusion is the problem of democracy itself: How can disparate perspectives be synthesized into a clear signal, while respecting the messiness of the discourse from which that signal emerges?

I am a PhD candidate in English at Brown University, where I'm finishing a dissertation on the impact of LLMs on language, literature, and literary studies, and a researcher and editor at BlockScience, where much of my work also involves thinking through the ways that artificial intelligence will impact the sociotechnical systems that increasingly define our world.
