In our series “Motee Minute”, we let our Motees have their say on the latest tech trends: unfiltered, frank, and opinionated. This week we’re hearing from Alberto. Alberto is a Solutions Architect and a Senior Project Owner at Motius. His main area of expertise is Artificial Intelligence (AI) and Data Science applications. Before joining Motius, Alberto worked on autonomous driving applications at the Audi AI Lab. Before that, he earned a Ph.D. in machine learning for swarm robotics while working at the German Aerospace Center (DLR).
Retrieval-augmented generation (RAG) is a hot topic in AI. It improves the quality of responses by giving AI access to domain-specific content [1]. Combining RAG with encrypted computing makes it possible for organizations to query sensitive topics without revealing what they searched for.
In a proof of concept (PoC), Motius and Roseman Labs built a pipeline that takes a potentially sensitive, security-related question and finds relevant answers in a large dataset from the Dutch National Cyber Security Center. The pipeline ensured that both the search question and the supporting material retrieved from the knowledge base remained secret throughout the process.
The project highlights how AI can address a critical challenge in regulated sectors such as law enforcement, high-tech manufacturing, and healthcare: using domain-specific data in a privacy-centric way.
Use Case: Mitigating Cybersecurity Risks for Large Organizations
Imagine the Chief Information Security Officer (CISO) of a vital organization faces a pressing cybersecurity risk in one of their systems. The CISO seeks advice on the best mitigation strategies from the National Cyber Security Center (NCSC). However, asking such questions openly could reveal details about their potentially vulnerable systems. This project explores how encrypted computing enables organizations to ask such questions without revealing them to the wider world.
The process begins by embedding the CISO's question into a vector representation on their own system, using a locally hosted state-of-the-art large language model (LLM). This embedding is encrypted using secure Multi-Party Computation (MPC) and sent to a secure matching engine hosted by the NCSC.
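To make the idea of protecting an embedding with MPC more concrete, here is a minimal sketch of additive secret sharing, one common MPC building block. The article does not say which scheme the PoC used, so the ring size, the fixed-point scale, the number of parties, and all function names below are assumptions for illustration only.

```python
import secrets

MODULUS = 2**64   # assumed ring in which shares live
SCALE = 2**16     # assumed fixed-point scaling factor

def encode(x: float) -> int:
    """Map a float to a fixed-point element of the ring."""
    return round(x * SCALE) % MODULUS

def decode(v: int) -> float:
    """Map a ring element back to a float; the upper half encodes negatives."""
    if v >= MODULUS // 2:
        v -= MODULUS
    return v / SCALE

def share(vector, n_parties=3):
    """Split every coordinate into n shares that sum to it modulo MODULUS."""
    shares = [[] for _ in range(n_parties)]
    for x in vector:
        parts = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
        last = (encode(x) - sum(parts)) % MODULUS
        for party, part in zip(shares, parts + [last]):
            party.append(part)
    return shares

def reconstruct(shares):
    """Recombine all shares; only someone holding every share can do this."""
    return [decode(sum(column) % MODULUS) for column in zip(*shares)]

# Toy stand-in for the embedding of the CISO's question.
embedding = [0.12, -0.53, 0.77, 0.0]
shares = share(embedding)
recovered = reconstruct(shares)
```

Each individual share is a list of uniformly random-looking integers, so a party holding fewer than all shares learns nothing about the question from its own share alone. The recovered values differ from the originals only by the fixed-point rounding error.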
The engine identifies the most relevant documents from the NCSC's repository by comparing the encrypted vector to the document embeddings, without ever decrypting the query or revealing its contents. The retrieved documents are then returned to the CISO or passed to a language model as context, forming the basis for expert advice. In the latter case, the additional context lets the language model generate a better-grounded response.
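The matching step can be illustrated with a simplified sketch. It assumes two non-colluding compute parties, additive secret sharing of the query, and document vectors that the compute parties can see in the clear. None of these details come from the PoC itself, and a production engine would typically also avoid revealing every score. The point is only that a dot product is linear, so each party can score documents using its share alone.

```python
import secrets

MODULUS = 2**64   # assumed share ring
SCALE = 2**12     # assumed fixed-point scale

def to_fixed(vector):
    """Scale floats to signed integers so arithmetic can happen modulo MODULUS."""
    return [round(x * SCALE) for x in vector]

def share_between_two(values):
    """Split each integer into two random-looking shares that sum to it mod MODULUS."""
    share_a = [secrets.randbelow(MODULUS) for _ in values]
    share_b = [(v - r) % MODULUS for v, r in zip(values, share_a)]
    return share_a, share_b

def partial_scores(query_share, documents):
    """One compute party: dot product of its query share with every document vector."""
    return {
        name: sum(q * d for q, d in zip(query_share, vec)) % MODULUS
        for name, vec in documents.items()
    }

def open_scores(partials_a, partials_b):
    """Querier: add the two partial results and undo the fixed-point scaling."""
    scores = {}
    for name in partials_a:
        total = (partials_a[name] + partials_b[name]) % MODULUS
        if total >= MODULUS // 2:      # upper half of the ring encodes negatives
            total -= MODULUS
        scores[name] = total / (SCALE * SCALE)
    return scores

# Toy unit-length vectors standing in for real embeddings (hypothetical names).
documents = {
    "patching-guide": to_fixed([0.6, 0.8, 0.0]),
    "phishing-advisory": to_fixed([0.0, 0.6, 0.8]),
    "ransomware-report": to_fixed([-0.8, 0.0, 0.6]),
}
query = to_fixed([0.6, 0.8, 0.0])

share_a, share_b = share_between_two(query)
scores = open_scores(partial_scores(share_a, documents),
                     partial_scores(share_b, documents))
best = max(scores, key=scores.get)
```

Neither compute party ever sees the query vector, yet the recombined scores equal the plain dot products of the fixed-point vectors. Here the query was chosen to match the first toy document, so that document receives the highest score.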
Demonstrating Success: Privacy, Accuracy, and Scalability
The collaboration between Roseman Labs and Motius achieved three key milestones, which are particularly relevant as organizations increasingly adopt data-sharing and collaboration frameworks.
- Stronger privacy: The pipeline ensured that sensitive queries and document comparisons remained secure end-to-end, addressing privacy concerns for organizations in highly regulated industries that do not want to disclose their sensitive information.
- Accurate document retrieval: Using cosine distance under encryption (also referred to as fuzzy matching), the project demonstrated high accuracy in identifying relevant documents. This holds even for a large corpus, given sufficient compute.
- Scalability: The solution proved that encrypted computing could handle queries in natural language, making it relevant for real-world applications.
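For readers unfamiliar with the metric named in the list above, the following plaintext sketch shows cosine distance and how it ranks documents. It runs without any encryption, and the vectors and document names are made up for illustration. It also shows why unit-length embeddings are convenient: the distance then reduces to one minus a dot product.

```python
import math

def cosine_distance(u, v):
    """1 - cos(angle between u and v); 0 means same direction, 2 means opposite."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def normalize(u):
    """Rescale a vector to unit length."""
    norm = math.sqrt(sum(a * a for a in u))
    return [a / norm for a in u]

def top_k(query, corpus, k=2):
    """Return the names of the k documents closest to the query."""
    ranked = sorted(corpus, key=lambda name: cosine_distance(query, corpus[name]))
    return ranked[:k]

# Hypothetical toy corpus; real embeddings have hundreds of dimensions.
corpus = {
    "patching-guide": [3.0, 4.0, 0.0],
    "phishing-advisory": [0.0, 3.0, 4.0],
    "ransomware-report": [-4.0, 0.0, 3.0],
}
query = [0.6, 0.8, 0.1]
closest = top_k(query, corpus, k=2)

# For unit vectors the distance is simply 1 - dot product.
q_unit = normalize(query)
d_unit = normalize(corpus["patching-guide"])
dot_only = 1.0 - sum(a * b for a, b in zip(q_unit, d_unit))
```

Because the normalized form needs only multiplications and additions, it is a natural fit for computation on protected data, where divisions and square roots tend to be more expensive.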
The Role of RAG and the Evolving Data Stack
The project aligns with trends in the modern data stack and addresses three of its critical layers.
- LLMs strengthen the semantic top layer, enabling people to interact with data in natural language.
- At the bottom layer of the data stack, foundational LLMs, RAG and larger context windows extend the application space and accuracy of AI for domain-specific applications.
- Lastly, the data stack extends across partners in the wider ecosystem, addressing the lateral layer for data collaboration. The EU concept of dataspaces is a great example [2]. Roseman Labs has coined the term encrypted dataspaces for secure collaboration between organizations that remains encrypted end-to-end [3].
RAG is emerging as a key capability for improving AI: it retrieves relevant documents dynamically and uses the larger context windows of modern LLMs to pass them to the model alongside the question.
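The augmentation step in RAG can be as simple as placing the retrieved passages in front of the question. The sketch below is an illustration rather than the PoC's implementation: the function name, the prompt wording, and the character budget standing in for a context window are all assumptions.

```python
def build_prompt(question, documents, max_chars=2000):
    """Prepend retrieved passages (most relevant first) until the budget is used up."""
    context_parts = []
    used = 0
    for index, doc in enumerate(documents, start=1):
        entry = f"[{index}] {doc.strip()}"
        if used + len(entry) > max_chars:
            break                      # stop before exceeding the assumed budget
        context_parts.append(entry)
        used += len(entry)
    context = "\n".join(context_parts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
    )

# Hypothetical retrieved passages, already ranked by relevance.
retrieved = [
    "Apply the vendor patch and restrict remote access until it is installed.",
    "Monitor authentication logs for repeated failed logins.",
    "x" * 5000,  # an oversized passage that will not fit the budget
]
prompt = build_prompt("How do we mitigate the reported vulnerability?", retrieved,
                      max_chars=200)
```

A larger context window corresponds to a larger budget, which lets more retrieved passages reach the model in a single request.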
Transformative Impact for Government and Industry
The implications of this project extend far beyond cybersecurity, providing benefits across regulated industries.
In law enforcement, securely analyzing sensitive intelligence across agencies without compromising operational confidentiality is highly relevant. For example, agencies could match encrypted crime data across jurisdictions to uncover patterns.
For healthcare, hospitals and researchers need to collaborate without exposing sensitive patient information. Providers need a platform for securely sharing clinical data to improve research and treatment quality.
Consider a patient with questions about a certain health issue that they are not comfortable asking their doctor directly (e.g. sexual health) but still want to get relevant information. They can ask the question confidentially via the secure platform and get answers based on publicly available data, in an easily accessible form.
In high-tech manufacturing, secure analytics on supply chain risks and intellectual property can provide new insights. Manufacturers can securely share and analyze encrypted data about production bottlenecks or cyber threats. Additionally, semiconductor fabs can optimize product yield and quality by fine-tuning machine parameters across a supply or production chain.
For data analytics teams building tools for these sectors, this solution offers a clear blueprint for integrating encrypted AI into the modern data stack. By harnessing RAG and encrypted computing, organizations can unlock powerful insights while safeguarding sensitive information.
The Path Forward
This project is a testament to the potential of encrypted AI to address real-world challenges. As larger context windows and encrypted computing continue to evolve, the opportunities for secure collaboration across industries will only grow. For governments and businesses navigating complex data privacy landscapes, such encryption solutions represent not just technological innovation, but a fundamental shift in how we think about trust, security, and collaboration in the age of AI.
[1] https://en.wikipedia.org/wiki/Retrieval-augmented_generation
[2] https://digital-strategy.ec.europa.eu/en/policies/data-spaces
[3] https://rosemanlabs.com/en/blogs/eu-vision-of-data-spaces-happening-in-the-netherlands