Creating an on-prem, lightweight AI research assistant

Mar 15

Over the past several weeks, I dove into a fascinating challenge: creating the Max Productivity Server—an on-prem, lightweight AI research assistant designed for efficiency, privacy, and scalability. My goal was straightforward yet ambitious: blend cutting-edge AI capabilities with robust automation to build a genuinely helpful personal research assistant.

As a product manager with 8 years of experience guiding transformative tools like WatchTower and SENTRY, and a developer leveraging automation technologies for over two decades (from AppleScript and XTension, through VBA and SQL, to Python), I approached the Max Productivity Server with strategic clarity: deliver immediate practical value while laying a foundation for growth.

One key innovation was implementing Retrieval-Augmented Generation (RAG), a method where an AI retrieves relevant information from a database before crafting a response. Inspired by recent research from Anthropic, I integrated their "Contextual Retrieval" technique, which improves RAG accuracy by prepending a short, chunk-specific context to each chunk before it is embedded and indexed, then combining dense vector embeddings with BM25 keyword search and an optional reranking step. Anthropic reports that this approach reduces failed retrievals by up to 67% when reranking is included, a result I replicated within our more constrained, lightweight environment using the automation platform n8n combined with vector databases such as Supabase and Qdrant.
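To make the idea concrete, here is a minimal sketch of the "contextualize, then index" step with a small BM25 scorer. It is an illustration only, not the n8n workflow itself: the function names, the sample chunks, and the ACME context sentence are hypothetical, and in a real pipeline the context would typically be generated by an LLM rather than written by hand.

```python
import math
import re
from collections import Counter


def contextualize(chunk: str, context: str) -> str:
    """Prepend a short situating context to a chunk before it is indexed."""
    return f"{context}\n{chunk}"


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every document against the query with the Okapi BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    avgdl = sum(len(d) for d in tokenized) / n

    # Document frequency: in how many documents does each term appear?
    df = Counter()
    for d in tokenized:
        df.update(set(d))

    scores = []
    for d in tokenized:
        tf = Counter(d)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            norm = tf[term] + k1 * (1 - b + b * len(d) / avgdl)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical sample data: the raw chunk never mentions the company name.
chunks = [
    "Revenue grew 3% over the previous quarter.",
    "The team shipped a new dashboard.",
]
contextual_chunks = [
    contextualize(chunks[0], "From the ACME Q2 report."),
    chunks[1],
]

print(bm25_scores("acme", chunks))             # raw chunks
print(bm25_scores("acme", contextual_chunks))  # contextualized chunks
```

The point of the example is that a keyword query for the company name cannot match the raw chunk at all, while the contextualized version of the same chunk becomes retrievable.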

Key Technical Highlights:

Contextual Retrieval Implementation: Integrated Anthropic's recommendations for enhancing RAG through advanced embedding strategies and metadata-enriched queries, substantially improving accuracy and relevance.
Vector Database Optimization: Experimented extensively with Supabase and Qdrant, optimizing for rapid and accurate data retrieval crucial for responsive user interactions.
Workflow Automation with n8n: Orchestrated sophisticated multi-step workflows that seamlessly integrated external AI APIs (OpenAI, ElevenLabs, Gooey.AI), significantly reducing manual intervention and enabling continuous, real-time learning and interaction.
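For readers wondering how results from a vector search and a BM25 search can be merged into one list, the sketch below uses reciprocal rank fusion (RRF). This is one common way to combine ranked lists and is shown here as an assumption, not as the exact logic inside the Max Productivity Server workflows; the function name, the chunk IDs, and the constant k=60 are illustrative choices.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of IDs into one list using RRF.

    Each ID earns 1 / (k + rank) from every list it appears in, so items
    ranked well by more than one retriever rise to the top.
    """
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by fused score, highest first; break ties by ID for stable output.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical result lists, best match first.
dense_hits = ["a", "b", "c"]   # e.g. from a vector store query
bm25_hits = ["b", "d", "a"]    # e.g. from a keyword index

print(reciprocal_rank_fusion([dense_hits, bm25_hits]))
```

Because RRF only looks at rank positions, it avoids having to normalize cosine similarities and BM25 scores onto the same scale, which keeps the merge step simple in a lightweight setup.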

Through agile iterations, technical pivots (e.g., shifting from initial Coqui TTS experiments to ElevenLabs for reliable voice generation), and leveraging real-world research, the Max Productivity Server became a practical showcase of how intelligent automation can transform productivity.

Reflecting on this project, I see clearly how vital adaptability, rigorous problem-solving, and continuous learning are. These are qualities I’ve consistently relied on throughout my career in government, product management, and data analytics.

What’s next? Continuing to refine Max’s capabilities and exploring more innovative approaches to automation and AI. I welcome thoughts, questions, or collaboration from anyone exploring similar intersections of product management, data analytics, and AI-driven automation.

#AI #Automation #ProductManagement #DataAnalytics #RAG #n8n #Anthropic #VectorDatabases #Innovation

Sean Lavigne
