<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="4.4.1">Jekyll</generator><link href="https://www.ombulabs.ai/blog/rss.xml" rel="self" type="application/atom+xml" /><link href="https://www.ombulabs.ai/blog/" rel="alternate" type="text/html" /><updated>2026-03-09T09:58:40-04:00</updated><id>https://www.ombulabs.ai/blog/rss.xml</id><title type="html">OmbuLabs Blog</title><subtitle>Custom AI Solutions</subtitle><author><name>OmbuLabs</name></author><entry><title type="html">Celebrating the Entrepreneurs Organization of Philadelphia with AI-Powered Branding</title><link href="https://www.ombulabs.ai/blog/eo-philadelphia-new-year-card-generator.html" rel="alternate" type="text/html" title="Celebrating the Entrepreneurs Organization of Philadelphia with AI-Powered Branding" /><published>2026-02-03T04:56:54-05:00</published><updated>2026-02-03T04:56:54-05:00</updated><id>https://www.ombulabs.ai/blog/eo-philadelphia-new-year-card-generator</id><content type="html" xml:base="https://www.ombulabs.ai/blog/eo-philadelphia-new-year-card-generator.html"><![CDATA[<p>As a year winds down, entrepreneurs often pause to reflect on the risks taken, the lessons learned,
the wins celebrated, and the setbacks overcome behind the scenes.</p>

<p>What if those moments could be captured visually in a way that felt personal, branded, and inspiring?</p>

<p>That is exactly what we set out to do with a <a href="https://eo.ombulabs.ai/">custom GenAI-powered card builder</a>
created for the <a href="https://www.eophiladelphia.com">Entrepreneurs Organization of Philadelphia</a>.</p>

<!--more-->

<h2 id="a-custom-ai-experience-for-eo-philadelphia">A Custom AI Experience for EO Philadelphia</h2>

<p>This project started as a small but meaningful side project for the
<a href="https://eonetwork.org">Entrepreneurs Organization</a> (EO) Philadelphia community.</p>

<p>Our founder, <a href="https://www.ombulabs.ai/blog/authors/etagwerker">Ernesto Tagwerker</a>, has been a part of
this community since 2023.</p>

<p>The goal was simple:</p>

<ul>
  <li>
    <p>Celebrate entrepreneurs as 2025 came to a close</p>
  </li>
  <li>
    <p>Highlight real moments from their lives and businesses</p>
  </li>
  <li>
    <p>Reinforce community, growth, and optimism heading into 2026</p>
  </li>
</ul>

<p>Using GenAI, we transform real photos into branded, New Year-themed cards tailored specifically for
EO Philadelphia members.</p>

<p>The result is a polished, fun experience that feels personal, authentic, and aligned with the EO brand.</p>

<h2 id="built-fast-designed-for-impact">Built Fast, Designed for Impact</h2>

<p>Like many of the GenAI tools we build for clients in Philadelphia and beyond, this card builder followed
a focused, efficient workflow.</p>

<h3 id="stack">Stack</h3>

<p>The EO Philadelphia card builder was created using a modern, production-ready AI stack:</p>

<ul>
  <li>
    <p>Python and FastAPI for the backend service that handles the AI interactions.</p>
  </li>
  <li>
    <p>React with MaterialUI for the frontend interface.</p>
  </li>
  <li>
    <p>LlamaIndex for the orchestration of the LLM and image generation models.</p>
  </li>
  <li>
    <p>OpenAI’s <code class="language-plaintext highlighter-rouge">gpt-4o-mini</code> for message validation and Google’s <code class="language-plaintext highlighter-rouge">gemini-3-pro-image-preview</code> for the image generation.</p>
  </li>
</ul>

<p>Because we’ve built similar GenAI-powered tools before, we were able to launch this microsite in just
a few days without sacrificing quality, branding, or reliability.</p>

<h2 id="how-the-ai-workflow-works">How the AI Workflow Works</h2>

<p>Behind the scenes, the GenAI-powered card builder follows a simple but powerful flow.</p>

<p>The process begins with user-provided context: a New Year message and a real photo from the year that serves as the foundation for the card.</p>

<p>The <a href="https://www.ombulabs.ai/blog/tags/artificial-intelligence">Artificial Intelligence</a> solution is then
guided with EO Philadelphia’s branding and New Year themes, ensuring the output aligns with
the organization’s identity. Images are transformed into a cohesive, polished visual style while messages
are validated and seamlessly integrated into the generated design.</p>
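
<p>To make that flow more concrete, here is a minimal sketch of the two-step pattern described above: a small LLM validates the message, then a branded prompt is assembled for the image model. The function names, prompt text, and branding string are illustrative assumptions, not the actual implementation from the project.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Illustrative branding guidance; the real project uses EO Philadelphia's
# actual brand guidelines and New Year themes.
BRAND_GUIDELINES = (
    "EO Philadelphia New Year card: warm, optimistic tone, "
    "organization colors, celebrating 2025 and welcoming 2026."
)

def validate_message(message):
    """Ask a small LLM whether the message is appropriate for a card."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Answer YES or NO: is this message appropriate "
                        "for a celebratory New Year card?"},
            {"role": "user", "content": message},
        ],
    )
    return response.choices[0].message.content.strip().upper().startswith("YES")

def build_image_prompt(message):
    """Combine the validated message with the branding guidance."""
    return BRAND_GUIDELINES + "\nIncorporate this message: " + message

# The branded prompt and the member's uploaded photo are then passed to the
# image generation model to produce the final card.
</code></pre></div></div>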

<p>The result is a shareable, high-quality card that feels personal while reinforcing the EO Philadelphia identity.
This kind of workflow is adaptable across industries, from internal team celebrations to marketing campaigns
and community engagement tools.</p>

<p>The template we designed also included a short message to encourage our entrepreneur friends to reach out
and find out more about EO Philadelphia.</p>

<h2 id="why-playful-brand-focused-ai-matters">Why Playful, Brand-Focused AI Matters</h2>

<p>Not every AI project needs to be a massive internal system or complex automation pipeline. Some of the most
effective AI tools are brand-forward, emotionally resonant, and designed for sharing and connection.</p>

<p>Experiences like this EO Philadelphia card builder can strengthen community engagement, elevate brand
perception, create memorable marketing moments, and showcase innovation without overwhelming users.</p>

<p>At <a href="https://www.ombulabs.ai">OmbuLabs.ai</a>, we build GenAI products, helping startups and established
organizations alike explore practical AI use cases, build custom AI-powered applications,
and move from idea to launch quickly.</p>

<p>This project is just one example of how thoughtful, creative AI can support storytelling, marketing,
and community while staying authentic to a brand.</p>

<h2 id="looking-ahead">Looking Ahead</h2>

<p>As we start 2026, we are excited to keep helping companies experiment, build, and innovate with AI.</p>

<p>Please feel free to give the <a href="https://eo.ombulabs.ai/">EO Card Generator</a> a try.</p>

<p><img src="/blog/assets/images/eo-philadelphia-new-year-card-generator-2026-interface.png" alt="Inteface for the Gen AI Card Generator Designed and Implemented for EO Philadelphia" /></p>

<p>If you’re interested in reading more about our solution, you can find the open source repository on GitHub:
<a href="https://github.com/ombulabs/eo-philadelphia-card-generator">EO Philadelphia Card Generator Source Code</a></p>

<p>Our founder created some fun images based on moments from his life over the past year, such as
visiting Disneyland with his kids, a date night with his wife, and even his cat.</p>

<p><img src="/blog/assets/images/eo-cards/ernesto-new-year-cards.jpg" alt="New Year Card" /></p>

<p>Have an idea for a branded GenAI experience?</p>

<p>Curious how generative AI could support your marketing, community, or internal tools?</p>

<p>👉 Reach out to <a href="https://www.ombulabs.ai/contact">OmbuLabs.ai</a>. We would love to explore what we can build together.</p>

<p>Here is to a wonderful 2026 with new challenges for entrepreneurs in Philadelphia! 🚀</p>]]></content><author><name>fionadl</name></author><category term="generative-ai" /><summary type="html"><![CDATA[As a year winds down, entrepreneurs often pause to reflect on the risks taken, the lessons learned, the wins celebrated, and the setbacks overcome behind the scenes. What if those moments could be captured visually in a way that felt personal, branded, and inspiring? That is exactly what we set out to do with a custom GenAI-powered card builder created for the Entrepreneurs Organization of Philadelphia.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.ombulabs.ai/blog/eo-ai-branding.jpg" /><media:content medium="image" url="https://www.ombulabs.ai/blog/eo-ai-branding.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Introducing the Rails Superhero Card Generator</title><link href="https://www.ombulabs.ai/blog/multi-modal-card-generator.html" rel="alternate" type="text/html" title="Introducing the Rails Superhero Card Generator" /><published>2025-12-16T17:56:54-05:00</published><updated>2025-12-16T17:56:54-05:00</updated><id>https://www.ombulabs.ai/blog/multi-modal-card-generator</id><content type="html" xml:base="https://www.ombulabs.ai/blog/multi-modal-card-generator.html"><![CDATA[<p>Ever felt like a superhero after solving a tricky bug or implementing a complex feature in your Rails application? What if you could capture that moment with your own custom superhero card?</p>

<p>We’re immortalizing those heroic coding moments with our new <strong>Rails Superhero Card Generator</strong>! This AI-powered tool creates personalized superhero cards featuring your photo and a catchy superhero name that reflects your coding prowess.</p>

<p>Navigate to the <a href="https://hero.fastruby.io/">Rails Superhero Card Generator</a>, tell it your superhero skills, upload a picture and generate your custom hero card!</p>

<!--more-->

<h2 id="how-it-works">How it Works</h2>

<p>The Rails Superhero Card Generator combines the power of large language models (LLMs) and image generation models to create custom cards from a provided picture.</p>

<p>It is a very small application with a pretty simple flow:</p>

<p><img src="/blog/assets/images/rails_cards/sequence_diagram.png" alt="Sequence Diagram" /></p>

<h3 id="stack">Stack</h3>

<p>The Rails Superhero Card Generator is built using:</p>

<ul>
  <li>Python and FastAPI for the backend service that handles the AI interactions.</li>
  <li>React with MaterialUI for the frontend interface.</li>
  <li>LlamaIndex for the orchestration of the LLM and image generation models.</li>
  <li>OpenAI’s <code class="language-plaintext highlighter-rouge">gpt-4o-mini</code> for generating superhero names and <code class="language-plaintext highlighter-rouge">gpt-image-1</code> for the image generation.</li>
</ul>

<p>You can see the code in the <code class="language-plaintext highlighter-rouge">rails-superhero-cards</code> repository on <a href="https://github.com/ombulabs/rails-superhero-cards">GitHub</a>.</p>

<h3 id="ai-workflow">AI Workflow</h3>

<p>The generator works with a pretty straightforward workflow:</p>

<p><img src="/blog/assets/images/rails_cards/workflow.png" alt="AI Workflow" /></p>

<p>The first step is judging whether the query coming from the user is fit for purpose or contains content we don’t want to process.
This is a simple classification task that we delegate to the LLM, and it acts as a gateway to filter out inappropriate content.</p>

<p>Then we generate a superhero name based on the user’s input using a simple call to the LLM.</p>

<p>Next, we generate the superhero card image using the image generation model, providing it with the user’s picture and the user’s query as context.</p>

<p>Finally, the card image is created from the generated image and the superhero name, and returned to the user.</p>

<p>The workflow is implemented in the <code class="language-plaintext highlighter-rouge">backend/workflow.py</code> file, and is orchestrated using LlamaIndex.</p>
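
<p>For readers curious about what that orchestration can look like, below is a minimal, simplified sketch of a LlamaIndex workflow with the same shape: moderate the query, pick a superhero name, then generate the card. The event and step names are illustrative and are not copied from <code class="language-plaintext highlighter-rouge">backend/workflow.py</code>.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from llama_index.core.workflow import (
    Event, StartEvent, StopEvent, Workflow, step,
)

class NameGenerated(Event):
    hero_name: str

class SuperheroCardWorkflow(Workflow):
    @step
    async def moderate_and_name(self, ev: StartEvent) -> NameGenerated:
        # 1) classify the user's query (appropriate or not) with the LLM
        # 2) if appropriate, ask the LLM for a superhero name
        # (LLM calls omitted here; a placeholder name keeps the sketch runnable)
        return NameGenerated(hero_name="Captain Refactor")

    @step
    async def generate_card(self, ev: NameGenerated) -> StopEvent:
        # call the image model with the user's photo and query as context,
        # then compose the final card with the generated name
        return StopEvent(result={"hero_name": ev.hero_name})

# usage, e.g. inside a FastAPI endpoint:
# result = await SuperheroCardWorkflow(timeout=120).run(query=query, photo=photo)
</code></pre></div></div>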

<h3 id="try-it-out">Try it Out!</h3>

<p>Not convinced yet? Here are the heroes behind our awesome team:</p>

<p><img src="/blog/assets/images/rails_cards/superhero_cards.png" alt="Superhero Cards" /></p>

<p>Check out the <a href="https://hero.fastruby.io/">Rails Superhero Card Generator</a>, upload your picture, and create your own superhero card today!</p>

<p>Have an idea to improve the generator? Found a bug? Feel free to open an issue or a pull request on the <a href="https://github.com/ombulabs/rails-superhero-cards">GitHub repository</a>.</p>

<p>Have an idea for another awesome AI tool? Something else entirely? <a href="https://www.fastruby.io/#contactus">Reach out to us</a>!</p>
how to leverage the power of Large Language Models (LLMs) while maintaining the privacy and security of sensitive data.</p>

<p>The benefits of AI are clear: increased productivity, automated workflows, and enhanced decision-making, to name a few.
But these advantages must be balanced against the risks of data exposure, compliance violations, and security breaches.</p>

<p>In this post, we’ll explore a graduated approach to AI adoption, starting with basic security practices and progressing to fully self-hosted solutions.
Each level offers increasing control over your data while requiring more investment and technical expertise.</p>

<p>You can choose which level best fits your organization’s needs, risk tolerance, and resources.</p>

<!--more-->

<h2 id="the-importance-of-responsible-ai-usage">The Importance of Responsible AI Usage</h2>

<p>Before diving into specific approaches, it’s essential to understand why AI privacy matters.
When team members use AI tools like ChatGPT, Claude, or Copilot, they may inadvertently share:</p>

<ul>
  <li>Proprietary code or algorithms</li>
  <li>Customer data and personally identifiable information (PII)</li>
  <li>Strategic business plans and financial information</li>
  <li>Trade secrets and competitive intelligence</li>
  <li>Internal communications and confidential documents</li>
</ul>

<p>Once this information is sent to an external AI service, you may lose control over how it’s used, stored, or potentially incorporated into model training.
Even if a provider claims not to use your data for training, data breaches, subpoenas, or changes in terms of service can expose your organization to unnecessary risk.</p>

<p>The key is to implement a strategy that matches your organization’s risk tolerance, technical capabilities, and budget while still enabling your team to benefit from AI capabilities.</p>

<h2 id="level-1-policy-based-approach-with-approved-providers">Level 1: Policy-Based Approach with Approved Providers</h2>

<p><strong>Best for:</strong> Small to medium organizations with limited technical resources, lower risk tolerance for sensitive data exposure.</p>

<p><strong>Investment required:</strong> Low (primarily time for policy development and training)</p>

<p>The foundation of any AI privacy strategy starts with clear policies and education.
This approach doesn’t require technical implementation but establishes guardrails for AI usage.</p>

<h3 id="key-components">Key Components:</h3>

<p><strong>1. Review and Approve AI Providers</strong></p>

<p>Not all AI providers handle data the same way. Start by researching the privacy policies of major providers:</p>

<ul>
  <li><strong>OpenAI (ChatGPT)</strong>: Offers enterprise plans with data processing agreements (DPAs) and commitments not to use customer data for training. Free and Plus tiers may use conversations for training unless opted out.</li>
  <li><strong>Anthropic (Claude)</strong>: Provides enterprise options with strong privacy commitments. Does not train on customer conversations in their API or enterprise products.</li>
  <li><strong>Google (Gemini)</strong>: Offers enterprise versions with data residency options and DPAs. Consumer versions may use data for model improvement.</li>
  <li><strong>Microsoft (Copilot)</strong>: Enterprise versions include data protection commitments and compliance certifications. Consumer versions have different terms.</li>
</ul>

<p>Create an approved list of providers and specific product tiers that meet your organization’s requirements. Document which features are approved and which are not.</p>

<p><strong>2. Develop Clear Usage Guidelines</strong></p>

<p>Create comprehensive guidelines that specify:</p>

<ul>
  <li>What types of information can and cannot be shared with AI tools</li>
  <li>Which AI tools are approved for different use cases</li>
  <li>How to anonymize or redact sensitive information before using AI</li>
  <li>Consequences for policy violations</li>
</ul>

<p>Example guidelines might include:</p>
<ul>
  <li>Use AI to draft general marketing copy or emails</li>
  <li>Ask AI for coding help with generic algorithms</li>
  <li>Never paste customer data, even for analysis</li>
  <li>Never share proprietary code or business logic</li>
  <li>Never input confidential strategic plans or financial data</li>
</ul>

<p><strong>3. Implement Training Programs</strong></p>

<p>Regular training ensures employees understand:</p>
<ul>
  <li>Why AI privacy matters</li>
  <li>How to identify sensitive information</li>
  <li>Techniques for using AI effectively without compromising security</li>
  <li>Real-world examples of data exposure incidents</li>
</ul>

<p><strong>4. Monitor and Audit</strong></p>

<p>Establish processes to:</p>
<ul>
  <li>Regularly review AI usage across the organization</li>
  <li>Update policies as new tools and risks emerge</li>
  <li>Conduct periodic audits of employee AI usage</li>
  <li>Gather feedback on policy effectiveness</li>
</ul>

<h3 id="limitations">Limitations:</h3>

<p>This approach relies heavily on employee compliance and doesn’t provide technical enforcement.
It’s vulnerable to human error and may not satisfy strict regulatory requirements.</p>

<h2 id="level-2-managed-ai-platforms-with-privacy-controls">Level 2: Managed AI Platforms with Privacy Controls</h2>

<p><strong>Best for:</strong> Organizations ready to invest in tools, needing better control and audit capabilities.</p>

<p><strong>Investment required:</strong> Medium (subscription costs, integration time)</p>

<p>The next step is to adopt platforms that provide centralized access to AI capabilities while offering enhanced privacy controls and administrative oversight.</p>

<h3 id="options">Options:</h3>

<p><strong>1. Enterprise AI Platforms</strong></p>

<p>Services like OpenAI’s Enterprise plan, Anthropic’s Claude for Enterprise, or Google’s Vertex AI provide:</p>

<ul>
  <li><strong>Data Processing Agreements (DPAs)</strong>: Legal commitments about how your data is handled</li>
  <li><strong>No training on your data</strong>: Guarantees that your inputs won’t be used to train models</li>
  <li><strong>Access controls</strong>: Admin panels to manage who can use AI and how</li>
  <li><strong>Audit logs</strong>: Track what’s being sent to AI services</li>
  <li><strong>Compliance certifications</strong>: SOC 2, GDPR, HIPAA compliance where applicable</li>
</ul>

<p><strong>2. AI Gateway Solutions</strong></p>

<p>Tools like Cloudflare AI Gateway, Portkey, or Devs.ai act as intermediaries between your team and AI providers:</p>

<ul>
  <li><strong>Request filtering</strong>: Block requests containing sensitive patterns (emails, API keys, etc.)</li>
  <li><strong>Rate limiting</strong>: Control costs and prevent excessive usage</li>
  <li><strong>Caching</strong>: Reduce costs by caching common queries</li>
  <li><strong>Analytics</strong>: Understand how AI is being used across your organization</li>
  <li><strong>Multi-provider support</strong>: Switch between AI providers without changing code</li>
</ul>
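
<p>Because many gateways expose an OpenAI-compatible endpoint, adopting one can be as small a change as pointing your existing client at a different base URL. Here is a minimal sketch, assuming a gateway that proxies OpenAI-style requests; the URL below is a placeholder, not a real gateway address.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from openai import OpenAI

# Point the existing OpenAI client at the gateway instead of api.openai.com.
# The base_url is a placeholder; each gateway documents its own URL format.
client = OpenAI(
    base_url="https://your-ai-gateway.example.com/v1",
    api_key="YOUR_PROVIDER_API_KEY",  # typically still your provider key
)

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize this quarter's release notes."}],
)
print(response.choices[0].message.content)
</code></pre></div></div>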

<p><strong>3. Secure AI Workspaces</strong></p>

<p>Platforms like Microsoft 365 Copilot or Google Workspace with Gemini integrate AI into existing productivity tools with:</p>

<ul>
  <li><strong>Data residency</strong>: Keep data within specific geographic regions</li>
  <li><strong>Tenant isolation</strong>: Your data stays within your organization’s environment</li>
  <li><strong>Existing security controls</strong>: Leverage your current identity and access management</li>
  <li><strong>Compliance alignment</strong>: Inherit compliance certifications from the platform</li>
</ul>

<h3 id="implementation-steps">Implementation Steps:</h3>

<ol>
  <li><strong>Assess your needs</strong>: Determine which AI capabilities your team requires</li>
  <li><strong>Evaluate providers</strong>: Compare privacy features, compliance certifications, and costs</li>
  <li><strong>Negotiate contracts</strong>: Ensure DPAs and service level agreements (SLAs) meet your requirements</li>
  <li><strong>Configure controls</strong>: Set up access policies, content filters, and audit logging</li>
  <li><strong>Migrate gradually</strong>: Start with low-risk use cases and expand as confidence grows</li>
  <li><strong>Train users</strong>: Ensure employees understand how to use the new platform</li>
</ol>

<h3 id="limitations-1">Limitations:</h3>

<p>While significantly more secure than consumer AI tools, you’re still sending data to external providers.
For highly sensitive data or strict regulatory environments, this may not be sufficient.</p>

<h2 id="level-3-custom-solutions-with-secure-platforms">Level 3: Custom Solutions with Secure Platforms</h2>

<p><strong>Best for:</strong> Organizations with technical teams, handling sensitive data, needing customization.</p>

<p><strong>Investment required:</strong> Medium to High (development time, infrastructure costs)</p>

<p>At this level, you build custom AI applications using secure platforms and APIs, giving you more control over data flow and processing.</p>

<h3 id="approaches">Approaches:</h3>

<p><strong>1. API-Based Custom Applications</strong></p>

<p>Build internal tools that use AI provider APIs with enhanced security, like custom-built guardrails and data sanitization tools.
This gives you control over what data is sent to the AI, what data is kept, and how responses are handled.</p>
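
<p>As a rough illustration of that idea, here is a minimal sketch of a pre-send sanitization step using simple regular expressions. The patterns are hypothetical examples; production pipelines usually combine pattern matching with named-entity recognition, allow-lists, and logging of what was redacted.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import re

# Illustrative patterns only; real pipelines cover many more cases.
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "api_key": re.compile(r"\bsk-[A-Za-z0-9]{16,}\b"),
}

def sanitize(text):
    """Replace sensitive patterns with placeholders before calling an AI API."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub("[REDACTED_" + label.upper() + "]", text)
    return text

if __name__ == "__main__":
    raw = "Email jane.doe@example.com, SSN 123-45-6789, key sk-abcdef1234567890"
    print(sanitize(raw))  # only the sanitized text leaves your network
</code></pre></div></div>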

<p><strong>2. Private AI Environments</strong></p>

<p>Use platforms that keep your data within your infrastructure:</p>

<ul>
  <li><strong>Azure OpenAI Service</strong>: Deploy OpenAI models in your Azure tenant with data isolation</li>
  <li><strong>AWS Bedrock</strong>: Access foundation models with data staying in your AWS environment</li>
  <li><strong>Google Cloud Vertex AI</strong>: Use AI models within your Google Cloud infrastructure</li>
</ul>

<p>These services provide:</p>
<ul>
  <li><strong>Virtual Private Cloud (VPC) deployment</strong>: Models run in your network</li>
  <li><strong>Customer-managed encryption keys</strong>: You control the encryption keys</li>
  <li><strong>Private endpoints</strong>: No data traverses the public Internet</li>
  <li><strong>Regional deployment</strong>: Keep data in specific geographic locations</li>
</ul>

<h3 id="implementation-considerations">Implementation Considerations:</h3>

<ul>
  <li><strong>Data classification</strong>: Identify what data can be sent externally vs. must stay internal</li>
  <li><strong>Sanitization pipelines</strong>: Implement robust data cleaning before AI processing</li>
  <li><strong>Audit trails</strong>: Log all AI interactions for compliance and security review</li>
  <li><strong>Access controls</strong>: Implement role-based access to AI features</li>
  <li><strong>Cost management</strong>: Monitor API usage and implement budgets</li>
  <li><strong>Fallback strategies</strong>: Plan for API outages or rate limiting</li>
</ul>

<h3 id="limitations-2">Limitations:</h3>

<p>You’re still dependent on external AI providers, though with more control. Costs can be significant, and you need technical expertise to build and maintain custom solutions.</p>

<h2 id="level-4-hybrid-approach-with-local-models">Level 4: Hybrid Approach with Local Models</h2>

<p><strong>Best for:</strong> Organizations with strong technical teams, handling very sensitive data, needing offline capabilities.</p>

<p><strong>Investment required:</strong> High (infrastructure, expertise, maintenance)</p>

<p>This approach combines external AI services for general tasks with locally-hosted models for sensitive operations.</p>

<h3 id="architecture">Architecture:</h3>

<p><strong>1. Task Classification</strong></p>

<p>Implement a routing system that determines where to process each request.</p>
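
<p>A minimal sketch of such a router is shown below, assuming a local model served through Ollama’s OpenAI-compatible endpoint and a cloud provider for everything else. The keyword heuristic and model names are illustrative; a real system would use your data-classification rules or a dedicated classifier.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>from openai import OpenAI

# Ollama exposes an OpenAI-compatible API on localhost by default.
local = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
cloud = OpenAI()  # assumes OPENAI_API_KEY is set

SENSITIVE_MARKERS = ("customer", "salary", "ssn", "patient", "contract")

def route(prompt):
    """Send sensitive prompts to the local model, everything else to the cloud."""
    is_sensitive = any(marker in prompt.lower() for marker in SENSITIVE_MARKERS)
    client = local if is_sensitive else cloud
    model = "llama3.1" if is_sensitive else "gpt-4o-mini"
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
</code></pre></div></div>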

<p><strong>2. Local Model Deployment</strong></p>

<p>Host smaller, specialized models on your infrastructure:</p>

<ul>
  <li><strong>Ollama</strong>: Easy local deployment of models like Llama, Mistral, or CodeLlama</li>
  <li><strong>vLLM</strong>: High-performance inference server for local models</li>
  <li><strong>Text Generation Inference (TGI)</strong>: Hugging Face’s inference server</li>
  <li><strong>LocalAI</strong>: OpenAI-compatible API for local models</li>
</ul>

<h3 id="infrastructure-requirements">Infrastructure Requirements:</h3>

<ul>
  <li><strong>GPU servers</strong>: For reasonable inference speed (NVIDIA A100, H100, or similar)</li>
  <li><strong>Model storage</strong>: Significant disk space for model weights (10GB-100GB+ per model)</li>
  <li><strong>Memory</strong>: Large RAM requirements (16GB-80GB+ depending on model size)</li>
  <li><strong>Monitoring</strong>: Track model performance, latency, and resource usage</li>
  <li><strong>Updates</strong>: Process for updating models as new versions are released</li>
</ul>

<h3 id="limitations-3">Limitations:</h3>

<p>Local models are typically less capable than frontier models like GPT-4 or Claude.
They require significant infrastructure investment and ongoing maintenance.
You’ll need ML expertise to optimize performance and troubleshoot issues.</p>

<h2 id="level-5-fully-self-hosted-ai-infrastructure">Level 5: Fully Self-Hosted AI Infrastructure</h2>

<p><strong>Best for:</strong> Large enterprises, highly regulated industries, organizations with strict data sovereignty requirements.</p>

<p><strong>Investment required:</strong> Very High (infrastructure, team, ongoing costs)</p>

<p>The most secure option is to host everything yourself: models, infrastructure, and supporting services.</p>

<h3 id="components">Components:</h3>

<p><strong>1. Complete Model Hosting</strong></p>

<p>Deploy and manage all AI models on your infrastructure:</p>

<ul>
  <li><strong>Model selection</strong>: Choose open-source models (Llama, Mistral, Falcon, etc.)</li>
  <li><strong>Inference infrastructure</strong>: Build scalable serving infrastructure</li>
  <li><strong>Model registry</strong>: Manage multiple models and versions</li>
  <li><strong>Load balancing</strong>: Distribute requests across multiple model instances</li>
</ul>

<p><strong>2. Supporting Infrastructure</strong></p>

<p>Build the complete stack:</p>

<ul>
  <li><strong>Vector databases</strong>: Self-hosted Postgres with pgvector, Qdrant, or Milvus</li>
  <li><strong>Embedding generation</strong>: Local embedding models for semantic search</li>
  <li><strong>Monitoring and observability</strong>: Track performance, costs, and usage</li>
  <li><strong>CI/CD pipelines</strong>: Automate model deployment and updates</li>
</ul>
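
<p>As one concrete example of the supporting stack, here is a minimal sketch of local semantic search with a self-hosted embedding model and Postgres with pgvector. The table name, connection string, and model choice are assumptions for illustration.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code>import psycopg
from sentence_transformers import SentenceTransformer

# Local embedding model; runs entirely on your own hardware (384 dimensions).
model = SentenceTransformer("all-MiniLM-L6-v2")

# Assumes Postgres with the pgvector extension installed and a table like:
#   CREATE TABLE docs (id serial PRIMARY KEY, content text, embedding vector(384));
conn = psycopg.connect("dbname=knowledge")

def to_pgvector(vec):
    """Format a Python list as pgvector's text representation."""
    return "[" + ",".join(str(x) for x in vec) + "]"

def add_document(text):
    vec = model.encode(text).tolist()
    conn.execute(
        "INSERT INTO docs (content, embedding) VALUES (%s, %s::vector)",
        (text, to_pgvector(vec)),
    )
    conn.commit()

def search(query, limit=3):
    vec = model.encode(query).tolist()
    rows = conn.execute(
        "SELECT content FROM docs ORDER BY cosine_distance(embedding, %s::vector) LIMIT %s",
        (to_pgvector(vec), limit),
    ).fetchall()
    return [row[0] for row in rows]
</code></pre></div></div>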

<p><strong>3. Security and Compliance</strong></p>

<p>Implement comprehensive security:</p>

<ul>
  <li><strong>Network isolation</strong>: Air-gapped or strictly controlled network access</li>
  <li><strong>Encryption</strong>: At-rest and in-transit encryption for all data</li>
  <li><strong>Access controls</strong>: Fine-grained permissions and authentication</li>
  <li><strong>Audit logging</strong>: Complete audit trail of all AI interactions</li>
  <li><strong>Compliance</strong>: Meet industry-specific requirements (HIPAA, SOC 2, ISO 27001)</li>
</ul>

<h3 id="limitations-4">Limitations:</h3>

<p>This approach requires significant investment and expertise. You’re responsible for everything: model performance, uptime, security, compliance, and updates.
It’s only cost-effective for large organizations with substantial AI usage or strict requirements that can’t be met any other way.</p>

<h2 id="choosing-the-right-approach">Choosing the Right Approach</h2>

<p>The best approach for your organization depends on several factors:</p>

<h3 id="consider-level-1-2-if">Consider Level 1-2 if:</h3>
<ul>
  <li>You’re just starting with AI adoption</li>
  <li>You have limited technical resources</li>
  <li>Your data sensitivity is moderate</li>
  <li>You need quick implementation</li>
  <li>Budget is constrained</li>
</ul>

<h3 id="consider-level-3-4-if">Consider Level 3-4 if:</h3>
<ul>
  <li>You handle sensitive customer data</li>
  <li>You have technical teams available</li>
  <li>You need customization and control</li>
  <li>Compliance requirements are stringent</li>
  <li>You’re willing to invest in infrastructure</li>
</ul>

<h3 id="consider-level-5-if">Consider Level 5 if:</h3>
<ul>
  <li>You operate in highly regulated industries (healthcare, finance, defense)</li>
  <li>Data sovereignty is critical</li>
  <li>You have very large AI usage volumes</li>
  <li>You have the budget and expertise</li>
  <li>External dependencies are unacceptable</li>
</ul>

<h2 id="practical-steps-to-get-started">Practical Steps to Get Started</h2>

<p>Regardless of which level you choose, follow these steps:</p>

<ol>
  <li><strong>Assess your current state</strong>: Audit how AI is currently being used in your organization</li>
  <li><strong>Classify your data</strong>: Identify what data is sensitive and what isn’t</li>
  <li><strong>Define requirements</strong>: Determine your privacy, compliance, and functional needs</li>
  <li><strong>Start small</strong>: Begin with low-risk use cases and proven approaches</li>
  <li><strong>Measure and iterate</strong>: Track usage, costs, and incidents; adjust your approach</li>
  <li><strong>Plan for growth</strong>: Design your strategy to evolve as your needs change</li>
</ol>

<p>You can always start with a simpler approach and graduate to more complex solutions as your AI usage matures.</p>

<h2 id="conclusion">Conclusion</h2>

<p>Integrating AI into your organization doesn’t require choosing between innovation and security.
By taking a graduated approach, starting with clear policies and approved providers, then progressing to managed platforms, custom solutions, and potentially self-hosted infrastructure, you can leverage AI’s benefits while maintaining control over your sensitive data.</p>

<p>The key is to match your approach to your organization’s specific needs, risk tolerance, and capabilities.
Start where you are, implement strong foundations, and evolve your strategy as your AI usage matures.</p>

<p>Check out our <a href="https://www.ombulabs.ai/blog/ai-readiness-transformation-assessment.html">AI Readiness &amp; Transformation Assessment</a> if you need help assessing your current state and planning your AI journey.</p>

<p>Remember: the goal isn’t to achieve perfect security at the cost of usability, but to find the right balance that enables your team to work effectively while protecting what matters most.</p>

<p>Want to explore how AI can transform your business while keeping your data secure? <a href="/#contact-us">Talk to us today!</a></p>]]></content><author><name>fionadl</name></author><category term="security-and-privacy" /><summary type="html"><![CDATA[As artificial intelligence becomes increasingly integrated into business operations, organizations face a critical challenge: how to leverage the power of Large Language Models (LLMs) while maintaining the privacy and security of sensitive data. The benefits of AI are clear: increased productivity, automated workflows, enhanced decision-making, to cite a few. But these advantages must be balanced against the risks of data exposure, compliance violations, and security breaches. In this post, we’ll explore a graduated approach to AI adoption, starting with basic security practices and progressing to fully self-hosted solutions. Each level offers increasing control over your data while requiring more investment and technical expertise. You can choose which level best fits your organization’s needs, risk tolerance, and resources.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.ombulabs.ai/blog/graduated-approach-to-privacy.jpg" /><media:content medium="image" url="https://www.ombulabs.ai/blog/graduated-approach-to-privacy.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">AI Readiness &amp;amp; Transformation Assessment for Development Teams</title><link href="https://www.ombulabs.ai/blog/ai-readiness-transformation-assessment.html" rel="alternate" type="text/html" title="AI Readiness &amp;amp; Transformation Assessment for Development Teams" /><published>2025-10-28T06:00:00-04:00</published><updated>2025-10-28T06:00:00-04:00</updated><id>https://www.ombulabs.ai/blog/ai-readiness-transformation-assessment</id><content type="html" xml:base="https://www.ombulabs.ai/blog/ai-readiness-transformation-assessment.html"><![CDATA[<p>The AI revolution is transforming how development teams work, and engineering leaders are grappling with critical questions:</p>

<blockquote>
  <p>How ready is our development team for AI?
Which AI tools will actually improve our engineering processes?
How do we integrate AI into our development workflows without disrupting productivity?</p>
</blockquote>

<p>At <a href="https://www.ombulabs.ai">OmbuLabs.ai</a>, we’ve been helping development teams navigate these challenges and see the common pitfalls that arise.
To address these challenges, we’ve developed a structured approach through our comprehensive <strong>AI Readiness &amp; Transformation Assessment for Development Teams</strong>.</p>

<p>This strategic evaluation helps engineering organizations understand their current AI maturity, identify the right opportunities for AI integration in development processes, and build a roadmap for successful AI transformation in their engineering workflows.</p>

<!--more-->

<h2 id="why-an-ai-readiness-assessment-matters-for-development-teams">Why an AI Readiness Assessment Matters for Development Teams</h2>

<p>The AI landscape is evolving rapidly, with new development tools and technologies emerging almost daily.
While the potential benefits for engineering teams are enormous (increased coding productivity, improved code quality, faster debugging, enhanced developer experience), the path to successful AI adoption in development processes is far from straightforward.</p>

<p>Many development teams make the mistake of jumping into AI tool implementation without a clear understanding of their workflows or a defined integration strategy, leading to:</p>

<ul>
  <li>Wasted resources on AI tools that don’t align with development workflows</li>
  <li>Security vulnerabilities from poorly configured AI coding assistants</li>
  <li>Developer resistance due to a lack of proper training and understanding of how to maximize AI tool benefits</li>
  <li>Inconsistent results from fragmented AI tool adoption across the team</li>
  <li>Compliance issues with code privacy and intellectual property concerns</li>
</ul>

<p>Our AI Readiness &amp; Transformation Assessment addresses these challenges by providing a structured, comprehensive evaluation of your development team’s AI potential and readiness.</p>

<h2 id="whats-included-in-our-development-team-ai-assessment">What’s Included in Our Development Team AI Assessment</h2>

<p>Our assessment is designed to give you a complete picture of where your development team stands today and where you can go with AI-powered development tools. Here’s what we evaluate:</p>

<h3 id="current-development-ai-maturity-analysis">Current Development AI Maturity Analysis</h3>

<p>We start by understanding your engineering team’s current relationship with AI development tools and automation:</p>

<ul>
  <li>Existing AI coding assistants and development tools currently in use</li>
  <li>Development infrastructure and CI/CD pipeline assessment</li>
  <li>Team skills and AI tool proficiency evaluations</li>
  <li>Current development workflows and processes that could benefit from AI</li>
  <li>Developer culture and readiness for AI tool adoption</li>
</ul>

<h3 id="strategic-development-ai-opportunity-identification">Strategic Development AI Opportunity Identification</h3>

<p>Not every AI development tool is right for every engineering team. We help you identify the most impactful opportunities:</p>

<ul>
  <li>High-value use cases specific to your development workflows, coding standards, and team practices</li>
  <li>Quick wins that can demonstrate immediate productivity improvements</li>
  <li>Development process optimization opportunities using AI and automation</li>
  <li>Developer experience enhancements through AI-powered coding solutions</li>
</ul>

<h3 id="development-tool-and-process-recommendations">Development Tool and Process Recommendations</h3>

<p>The AI development tool landscape is vast and constantly changing. We cut through the noise to recommend:</p>

<ul>
  <li>Specific AI coding assistants and development tools that align with your team’s needs and budget</li>
  <li>Integration strategies for seamless adoption into existing development workflows</li>
  <li>Vendor evaluation criteria to help you make informed decisions about AI development tools</li>
  <li>Custom vs. off-the-shelf AI solutions analysis for development processes</li>
  <li>Development stack recommendations for optimal AI tool performance</li>
</ul>

<h3 id="security-and-compliance-framework-for-development-teams">Security and Compliance Framework for Development Teams</h3>

<p>AI development tools introduce new security and compliance considerations. Our assessment covers:</p>

<ul>
  <li>Code privacy and intellectual property protection strategies for AI coding assistants</li>
  <li>Security best practices for AI development tool implementation</li>
  <li>Compliance requirements specific to your industry and codebase</li>
  <li>Risk assessment and mitigation strategies for AI-generated code</li>
  <li>Governance frameworks for responsible AI use in development processes</li>
</ul>

<h3 id="cost-benefit-analysis-for-development-teams">Cost-Benefit Analysis for Development Teams</h3>

<p>We provide a detailed financial analysis to help you make informed investment decisions:</p>

<ul>
  <li>Total cost of ownership for recommended AI development tools and licenses</li>
  <li>Cost optimization strategies to maximize development productivity value</li>
  <li>Resource requirements for successful AI tool implementation and training</li>
</ul>

<h2 id="our-development-team-assessment-process">Our Development Team Assessment Process</h2>

<h3 id="phase-1-discovery-and-current-development-state-analysis">Phase 1: Discovery and Current Development State Analysis</h3>

<p>We begin with comprehensive stakeholder interviews and development environment audits:</p>

<ul>
  <li>Engineering leadership interviews to understand development objectives and challenges</li>
  <li>Developer surveys to assess current AI tool knowledge and usage patterns</li>
  <li>Development infrastructure review of existing systems, CI/CD pipelines, and tooling</li>
  <li>Development workflow mapping of key coding, testing, and deployment processes</li>
</ul>

<h3 id="phase-2-strategic-ai-roadmap-development">Phase 2: Strategic AI Roadmap Development</h3>

<p>Based on our discovery findings, we identify and prioritize AI development opportunities and create a comprehensive roadmap for your development team’s AI transformation:</p>

<ul>
  <li>Phased implementation plan with clear development milestones</li>
  <li>AI development tool and process recommendations, including specific integration and configuration requirements</li>
  <li>Developer training and skill development requirements</li>
  <li>Change management strategy for smooth AI tool adoption across the development team</li>
  <li>Success metrics and KPIs for measuring development productivity improvements</li>
</ul>

<h3 id="phase-3-custom-configuration-and-development-integration-planning">Phase 3: Custom Configuration and Development Integration Planning</h3>

<p>For development teams ready to move forward, we provide full implementation support:</p>

<ul>
  <li>Custom configuration files for recommended AI development tools</li>
  <li>Integration specifications with existing development systems, IDEs, and CI/CD pipelines</li>
  <li>Security configuration templates and guidelines for AI development tools</li>
</ul>

<h2 id="development-process-ai-integration">Development Process AI Integration</h2>

<p>Our assessment focuses specifically on integrating AI into core development processes. We evaluate how to seamlessly incorporate AI into:</p>

<h3 id="core-development-workflows">Core Development Workflows</h3>

<ul>
  <li>Code generation and completion tools like GitHub Copilot, Cursor, Claude Code, or Augment</li>
  <li>Automated testing and quality assurance using AI-powered testing tools and test generation</li>
  <li>Code review and security scanning with AI assistance for faster, more thorough reviews</li>
  <li>Documentation generation and maintenance automation for code comments and README files</li>
  <li>Bug detection and resolution using AI analysis and debugging assistance</li>
  <li>Refactoring and code optimization with AI-powered suggestions and automated improvements</li>
</ul>

<h3 id="development-team-collaboration-and-management">Development Team Collaboration and Management</h3>

<ul>
  <li>Sprint planning and estimation with AI assistance for more accurate project planning</li>
  <li>Knowledge management and documentation search across codebases and internal documentation</li>
  <li>Developer onboarding and mentoring with AI-powered learning and guidance tools</li>
  <li>Code pattern recognition and best practice enforcement across the development team</li>
</ul>

<h2 id="custom-development-ai-solutions">Custom Development AI Solutions</h2>

<p>In addition to assessment, tool integration, and configuration, our team can also identify opportunities and build custom AI-powered solutions tailored to your specific development needs.</p>

<p>We’ll help you identify where a custom development solution makes sense and develop it end-to-end, from ideation to deployment.</p>

<p>Custom development solutions can include:</p>

<ul>
  <li>RAG-powered knowledge bases for internal development documentation, coding standards, and architectural decisions</li>
  <li>AI code reviewers that specialize in your codebase, domain-specific requirements, and team coding standards</li>
  <li>Custom AI chatbots for developer support, onboarding, and technical question answering</li>
  <li>LLM-powered development services to automate manual and repetitive development processes</li>
  <li>AI-powered code migration tools for framework upgrades, refactoring, and technical debt reduction</li>
  <li>Custom development workflow automation using AI to streamline repetitive development tasks</li>
</ul>

<h2 id="getting-started-with-your-development-team-ai-assessment">Getting Started with Your Development Team AI Assessment</h2>

<p>Ready to unlock your development team’s AI potential? Our AI Readiness &amp; Transformation Assessment is designed for development teams of all sizes, from startup engineering teams looking to gain a competitive edge to established enterprise development organizations planning large-scale AI tool adoption.</p>

<h3 id="investment-and-timeline">Investment and Timeline</h3>

<p>Our comprehensive development team assessment is delivered over 4 weeks and includes:</p>

<ul>
  <li>Detailed assessment report with findings and development-specific recommendations</li>
  <li>Strategic roadmap with prioritized implementation phases for development workflows</li>
  <li>Follow-up consultation to address questions and refine development plans</li>
  <li>Custom configuration files and integration plans for development tools (if applicable)</li>
</ul>

<p>The investment for the full assessment is $24,000. Custom solution development and tooling implementation are priced separately based on scope.</p>

<h2 id="why-ombulabs-for-your-ai-assessment">Why OmbuLabs for Your AI Assessment</h2>

<p>When it comes to navigating your development team’s AI transformation, you need more than just consultants; you need partners who understand both the technical complexities and the human side of change.</p>

<h3 id="deep-technical-expertise-in-ai-and-development">Deep Technical Expertise in AI and Development</h3>

<p>At <a href="https://ombulabs.ai">OmbuLabs.ai</a>, we don’t just talk about AI. We build it. Our team has hands-on experience developing custom AI solutions, from RAG-powered solutions, to multi-agent systems, to machine learning models that drive real business impact. We’ve worked with organizations big and small (take a look at our <a href="https://www.ombulabs.ai/case-studies">Case Studies</a>) to deliver production-ready AI solutions, and we bring that same practical expertise to every assessment.</p>

<p>We understand development workflows from the inside out because we’re developers ourselves. We know the challenges of integrating new tools into existing processes, the importance of developer experience, and how to balance innovation with stability.</p>

<h3 id="lean-agile-and-results-focused-approach">Lean, Agile, and Results-Focused Approach</h3>

<p>Our approach is built on solving real problems and driving real value. We don’t believe in one-size-fits-all solutions or unnecessary complexity. Instead, we focus on:</p>

<ul>
  <li><strong>The right problem to solve</strong>: We help you identify where AI will have the most impact on your development processes.</li>
  <li><strong>The most valuable insights to extract</strong>: We cut through the AI hype to recommend tools and strategies that align with your specific needs.</li>
  <li><strong>The right tool for the job</strong>: We’re technology-agnostic. We recommend solutions based on what works best for your team, not what’s trendy.</li>
</ul>

<h3 id="people-first-philosophy">People-First Philosophy</h3>

<p>Technology is only as good as the people using it. That’s why our assessments go beyond technical recommendations to address the human elements of AI adoption:</p>

<ul>
  <li>Developer training and skill development strategies</li>
  <li>Change management approaches that reduce resistance and increase buy-in</li>
  <li>Cultural readiness evaluation to ensure your team is set up for success</li>
</ul>

<p>When you work with OmbuLabs.ai, you’re collaborating with compassionate individuals who understand that successful AI transformation requires both technical excellence and empathy for the people affected by change.</p>

<h3 id="proven-track-record">Proven Track Record</h3>

<p>Our clients consistently praise our communication, responsiveness, and ability to deliver high-quality results. We’ve helped organizations across industries unlock the potential of their data and technology, and we bring that same commitment to excellence to every AI assessment.</p>

<p>We’re not just consultants who deliver a report and disappear. We’re partners invested in your success, offering ongoing support and flexible engagement models that adapt as your needs evolve.</p>

<h3 id="end-to-end-capabilities">End-to-End Capabilities</h3>

<p>Unlike firms that only assess or only implement, we offer the full spectrum of AI services. After your assessment, we can:</p>

<ul>
  <li>Configure and integrate recommended AI development tools</li>
  <li>Build custom AI solutions tailored to your specific needs</li>
  <li>Provide fractional AI and machine learning services to complement your team</li>
  <li>Support ongoing optimization and MLOps as your AI capabilities mature</li>
</ul>

<p>This means you get continuity from assessment through implementation, with a partner who already understands your context and goals.</p>

<h2 id="your-development-teams-ai-transformation-starts-here">Your Development Team’s AI Transformation Starts Here</h2>

<p>The question isn’t whether AI will transform software development; it’s whether your development team will be leading that transformation or struggling to catch up. Our AI Readiness &amp; Transformation Assessment gives you the strategic clarity and practical roadmap you need to succeed in the AI-driven development landscape.</p>

<p>Don’t let uncertainty hold your development team back from AI’s transformative potential. Let us help you assess your team’s readiness, identify your development opportunities, and build a roadmap for successful AI adoption in your engineering processes.</p>

<p>Ready to work with a team that combines deep technical expertise, practical experience, and a people-first approach? <a href="https://www.ombulabs.ai/contact">Contact us today</a> to schedule your AI Readiness &amp; Transformation Assessment.</p>]]></content><author><name>abizzinotto</name></author><category term="artificial-intelligence" /><summary type="html"><![CDATA[The AI revolution is transforming how development teams work, and engineering leaders are grappling with critical questions: How ready is our development team for AI? Which AI tools will actually improve our engineering processes? How do we integrate AI into our development workflows without disrupting productivity? At OmbuLabs.ai, we’ve been helping development teams navigate these challenges and see the common pitfalls that arise. To address these challenges, we’ve developed a structured approach through our comprehensive AI Readiness &amp; Transformation Assessment for Development Teams. This strategic evaluation helps engineering organizations understand their current AI maturity, identify the right opportunities for AI integration in development processes, and build a roadmap for successful AI transformation in their engineering workflows.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.ombulabs.ai/blog/ai-readiness.jpg" /><media:content medium="image" url="https://www.ombulabs.ai/blog/ai-readiness.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Takeaways from NERCOMP 2025 with a focus on AI in Higher Ed (Part 2)</title><link href="https://www.ombulabs.ai/blog/nercomp-2025-AI-summary-part-02.html" rel="alternate" type="text/html" title="Takeaways from NERCOMP 2025 with a focus on AI in Higher Ed (Part 2)" /><published>2025-09-22T04:05:26-04:00</published><updated>2025-09-22T04:05:26-04:00</updated><id>https://www.ombulabs.ai/blog/nercomp-2025-AI-summary-part-02</id><content type="html" xml:base="https://www.ombulabs.ai/blog/nercomp-2025-AI-summary-part-02.html"><![CDATA[<p>I recently attended NERCOMP for the first time and I got to connect with IT professionals
in the higher ed space. This is the second article in a series collecting my personal takeaways
from some of the most interesting AI-related sessions at the conference.</p>

<p>The first article in the series was about <a href="/blog/nercomp-2025-AI-summary-part-01.html">local AI in higher ed</a>. This one covers a case study on account provisioning with AI at UCLA.</p>

<!--more-->

<h2 id="the-future-of-account-provisioning-with-ai-at-ucla">The Future of Account Provisioning with AI at UCLA</h2>

<p><img src="/blog/assets/images/nercomp-2025/student-support.jpg" alt="An Early AI Use Case at UCLA" /></p>

<p><a href="https://www.linkedin.com/in/annaahearn/">Anna Ahearn</a> and
<a href="https://www.linkedin.com/in/krithik-udayashankar-45bb71a9/">Krithik Udayashankar</a>
from <a href="https://dts.ucla.edu">UCLA IT Services</a> presented a solution that leaned
heavily on custom GPTs to optimize account provisioning at different departments of their
institution.</p>

<h3 id="problem">Problem</h3>

<p>UCLA’s provisioning workflows are complex, inconsistent, and poorly documented. This causes
delays, risk, and frequent mistakes when onboarding and offboarding people in their systems.</p>

<h3 id="vision">Vision</h3>

<p>The team had a vision of an <a href="https://www.ombulabs.ai/blog/tags/generative-ai">AI chatbot</a>
using custom GPTs to provide accurate answers to access-related questions, reduce administrative
overhead, and improve compliance and efficiency.</p>

<p>The envisioned solution would streamline workflows and address common challenges in managing
access provisioning across departments.</p>

<h3 id="solution">Solution</h3>

<p>The team conducted stakeholder interviews to process-map current workflows and identify key
challenges. They then developed a domain-specific GPT chatbot tailored to address real scenarios,
such as providing guidance on requesting access to the
<a href="https://financialaid.ucla.edu/bruin-financial-aid">Bruin Financial Aid system</a> or updating a
user’s role.</p>

<p><img src="/blog/assets/images/ai-chatbot-steps.png" alt="Steps the team followed to create a custom Chat GPT bot" /></p>

<p>This approach ensured the chatbot could deliver accurate, context-aware responses to
streamline access provisioning tasks.</p>

<p>For the implementation, the team decided to use <a href="https://openai.com/index/introducing-gpts/">OpenAI’s custom GPTs</a>
feature.</p>

<h3 id="ethical-commitments">Ethical Commitments</h3>

<p>From the very beginning, the team prioritized adherence to
<a href="https://ai.universityofcalifornia.edu/_files/documents/ai-council-uc-responsible-ai-principles.pdf">UCLA’s Responsible AI Principles</a>:</p>

<p><img src="/blog/assets/images/UC-responsible-AI-principles.png" alt="UCLA Responsible AI Principles" /></p>

<ul>
  <li>
    <p>Appropriateness: The potential benefits and risks of AI and the needs and priorities of
those affected should be carefully evaluated to determine whether AI should be applied or prohibited.</p>
  </li>
  <li>
    <p>Transparency: Individuals should be informed when AI-enabled tools are being used.</p>
  </li>
  <li>
    <p>Accuracy, Reliability, and Safety: AI-enabled tools should be effective, accurate, and reliable
for the intended use and verifiably safe and secure throughout their lifetime.</p>
  </li>
  <li>
    <p>Fairness and Non-Discrimination: AI-enabled tools should be assessed for bias and discrimination.</p>
  </li>
  <li>
    <p>Privacy and Security: AI-enabled tools should be designed in ways that maximize privacy
and security of persons and personal data.</p>
  </li>
  <li>
    <p>Human Values: AI-enabled tools should be developed and used in ways that support the ideals of
human values, such as human agency and dignity, and respect for civil and human rights.</p>
  </li>
  <li>
    <p>Shared Benefit and Prosperity: AI-enabled tools should be inclusive and promote equitable
benefits (e.g., social, economic, environmental) for all.</p>
  </li>
  <li>
    <p>Accountability: The University of California should be held accountable for its development
and use of AI systems in service provision in line with the above principles.</p>
  </li>
</ul>

<p>They emphasized the importance of stakeholders understanding and trusting the system.</p>

<p>Co-creation was a key focus, which led the team to collaborate with diverse groups
to shape the solution.</p>

<p>Additionally, they worked diligently to mitigate bias, recognizing its potential impact on the
effectiveness and fairness of the
<a href="https://www.ombulabs.ai/blog/tags/artificial-intelligence">AI</a> system.</p>

<h3 id="future-outlook">Future Outlook</h3>

<p>The team envisions a future where their solution incorporates predictive access assignments,
enabling the system to anticipate user needs based on patterns and roles.</p>

<p>They aim to implement dynamic access control, allowing permissions to adjust in <em>real time</em> as
circumstances change.</p>

<p>At the same time, they aim to have a solution with automated deprovisioning that will streamline the removal of access when it is no longer needed, reducing risks and administrative overhead.</p>

<p>Additionally, they plan to leverage AI for adaptive cybersecurity, ensuring robust protection that evolves to counter emerging threats.</p>

<h2 id="final-thoughts">Final Thoughts</h2>

<p>In reflecting on UCLA’s approach, it’s clear to me that the success of AI-powered solutions in higher
ed depends on more than just technical innovation. The human factors of trust, bias, and usability remain central to these solutions.</p>

<p>Systems must be designed not only to save people’s time, but also to empower the people who rely on them every day.</p>

<p>When stakeholders are involved in shaping solutions, and when transparency and ethical commitments are prioritized, users are more likely to trust and adopt new tools.</p>

<p>Ultimately, AI should make processes easier and faster, but it must also be easy to understand for those it serves.</p>

<p>The most impactful systems are those that enhance human capabilities, foster collaboration, and build confidence.</p>

<p>At the same time, higher ed organizations don’t need highly customized Generative AI solutions to solve
real problems.</p>

<p>For publicly available content, where copyright is not a big concern, custom GPTs built on top of OpenAI can be cost-effective and reduce the time to market for your solution.</p>

<h2 id="futher-reading">Futher Reading</h2>

<ul>
  <li>
    <p><a href="https://events.educause.edu/nercomp-annual-conference/2025/agenda/the-future-of-account-provisioning-with-aienhanced-tools-for-university-staff-1">Abstract: The Future of Account Provisioning with AI-Enhanced Tools for University Staff</a></p>
  </li>
  <li>
    <p><a href="https://files.abstractsonline.com/CTRL/49/3/A11/E15/FFC/49A/296/BDB/805/263/EA5/17/a271_1.pdf">Slides: The Future of Account Provisioning with AI-Enhanced Tools for University Staff</a></p>
  </li>
  <li>
    <p><a href="https://ai.universityofcalifornia.edu/_files/documents/ai-council-uc-responsible-ai-principles.pdf">UCLA’s Responsible AI Principles</a></p>
  </li>
  <li>
    <p><a href="https://help.openai.com/en/articles/8554397-creating-a-gpt">Creating a Custom Chat GPT</a></p>
  </li>
</ul>]]></content><author><name>etagwerker</name></author><category term="artificial-intelligence" /><summary type="html"><![CDATA[I recently attended NERCOMP for the first time and I got to connect with IT professionals in the higher ed space. This is the second article in a series collecting my personal takeaways from some of the most interesting AI-related sessions at the conference. The first article in the series was about local AI in higher ed. This one is about a case study about account provisioning with AI at UCLA.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.ombulabs.ai/blog/nercomp-pt-2.jpg" /><media:content medium="image" url="https://www.ombulabs.ai/blog/nercomp-pt-2.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Takeaways from NERCOMP 2025 with a focus on AI in Higher Ed (Part 1)</title><link href="https://www.ombulabs.ai/blog/nercomp-2025-AI-summary-part-01.html" rel="alternate" type="text/html" title="Takeaways from NERCOMP 2025 with a focus on AI in Higher Ed (Part 1)" /><published>2025-08-19T05:07:53-04:00</published><updated>2025-08-19T05:07:53-04:00</updated><id>https://www.ombulabs.ai/blog/nercomp-2025-AI-summary-part-01</id><content type="html" xml:base="https://www.ombulabs.ai/blog/nercomp-2025-AI-summary-part-01.html"><![CDATA[<p>A few months ago I attended <a href="https://events.educause.edu/nercomp-annual-conference/2025">NERCOMP</a>
for the first time and I got to connect with IT professionals
in the higher ed space. This is the first article in a series collecting my takeaways
from some of the most interesting AI-related sessions at the conference.</p>

<p>I tried to focus on real-world examples of how higher ed professionals are thinking through and
applying AI and ML, with all their limitations, in day-to-day academic life.</p>

<h2 id="keep-your-data-close-and-your-ai-closer-local-ai-in-the-academy">Keep Your Data Close and Your AI Closer: Local AI in the Academy</h2>

<p>In their presentation, <a href="https://www.linkedin.com/in/gpetruzella/">Gerol Petruzella</a> &amp;
<a href="https://www.linkedin.com/in/trevor-murphy-4b99965/">Trevor Murphy</a> (Williams College)
shared 3 case studies that leveraged AI to solve real scenarios at their institution.</p>

<p>These were 3 privacy-first, cloud-free applications of local AI to support pedagogy,
accessibility, and student experimentation.</p>

<!--more-->

<h3 id="case-1--private-archive-analysis">Case 1: 📂 Private Archive Analysis</h3>

<p>In this case study, the team needed to analyze 822 historical syllabi for teaching patterns.</p>

<blockquote>
  <p>“The Rice Center for Teaching sought to use a set of historical Williams syllabi
(in a mixture of pdf and docx file formats) to discover patterns in pedagogical
design and practice, to inform support for incoming Williams faculty.
These documents are not publicly available, and so RCT sought a local (non-cloud)
retrieval augmented generation (RAG) application to preserve the privacy of the
source documents, while leveraging the power of a chat-based generative LLM to
efficiently discover and retrieve useful patterns from a large corpus in a timely way.”</p>
</blockquote>

<p>It was very important that the solution keep sensitive internal documents entirely
offline due to copyright restrictions.</p>

<p>More technical details:</p>

<ul>
  <li><strong>Tech Stack:</strong> <a href="https://ollama.com">Ollama</a> + <a href="https://docs.openwebui.com">OpenWebUI</a></li>
  <li><strong>Key Feature:</strong> Retrieval-Augmented Generation (RAG) with embedded citations</li>
  <li><strong>Embed Model</strong>: <a href="https://huggingface.co/mixedbread-ai/mxbai-embed-large-v1">mxbai-embed-large-v1</a></li>
  <li><strong>Generation Model</strong>: <a href="https://huggingface.co/microsoft/Phi-3-medium-128k-instruct">phi-3-medium-128k-instruct</a></li>
</ul>

<p>Both the project design and the proposed solution are appropriately crafted to respect
Williams College’s
<a href="https://oit.williams.edu/help-guides/computing-and-research/guidance-for-working-with-generative-ai-tools/">Guidance for Working with Generative AI Tools</a>.</p>

<h3 id="case-2-️-audio-transcription-with-openvino">Case 2: 🎞️ Audio Transcription with OpenVINO</h3>

<p>In this scenario, they needed to create accurate SRT files with no cloud tools,
ensuring copyright compliance and data control. The source content was Italian-language films.</p>

<blockquote>
  <p>“The OpenVINO toolkit expands the familiar open-source audio editing application
Audacity with 100% local (not cloud-based) AI capacities.”</p>
</blockquote>

<p>More technical details:</p>

<ul>
  <li><strong>Tech Stack:</strong> <a href="https://www.audacityteam.org">Audacity</a> with
<a href="https://www.audacityteam.org/download/openvino/">OpenVINO</a> plugins</li>
</ul>

<h3 id="case-3--ephbot--the-pocket-sized-genai-lab">Case 3: 🤖 EphBot – The Pocket-Sized GenAI Lab</h3>

<p><img src="/blog/assets/images/nercomp-2025/eph-bot-williams.jpg" alt="EphBot Device at Williams" /></p>

<p>In the last case, they wanted to allow students to explore LLM behavior offline. They
designed and implemented a solution that was easily portable and not resource intensive.</p>

<p>One of the features allowed people to measure electricity usage, which helped teach sustainability
and AI ethics.</p>

<p>Another neat feature was that you could adjust the LLM’s configuration (e.g. temperature)
and set it to values that make the model produce more or fewer hallucinations.</p>

<blockquote>
  <p>“You can check out EphBot from the Equipment Loan Center in Sawyer Library. Plug it in,
connect your phone to the “ephbot” Wi-Fi hotspot it creates, point your browser at
ephbot.local:8080, and you’re ready to chat. EphBot’s interface is similar to other
AI chatbots, but it gives you more controls to tinker under the hood: settings,
prompts, and more.”</p>
</blockquote>

<p>More technical details:</p>

<ul>
  <li><strong>Tech Stack:</strong> Raspberry Pi 5</li>
  <li><strong>LLM:</strong> <a href="https://github.com/Mozilla-Ocho/llamafile">llamafile</a> + <a href="https://huggingface.co/Mozilla/OLMo-7B-0424-llamafile">OLMo LLM</a></li>
</ul>

<p>Hardware and Software Requirements:</p>

<ul>
  <li>128GB Samsung EVO Plus micro-SD card (flashed with Raspberry Pi OS)</li>
  <li>CanaKit MicroSD reader/USB-A adapter</li>
  <li>Laptop (with Raspberry Pi Imager application)</li>
  <li>Ethernet connection (for initial setup and llamafile download)</li>
  <li>Standard Pi peripherals (power cord, micro-HDMI cable, keyboard, mouse)</li>
</ul>

<h2 id="final-thoughts">Final Thoughts</h2>

<p>This presentation was thought-provoking and enlightening. It shows that there are many
freely available tools and models that can be used to host AI-powered solutions in your
own infrastructure with basic knowledge of AI and ML.</p>

<p>Attendees learned that you don’t need to use third-party services like ChatGPT to
develop powerful AI solutions. This is crucial when working with copyrighted content
that needs to be protected from improper use by third parties.</p>]]></content><author><name>etagwerker</name></author><category term="artificial-intelligence" /><summary type="html"><![CDATA[A few months ago I attended NERCOMP for the first time and I got to connect with IT professionals in the higher ed space. This is the first article in a series collecting my takeaways from some of the most interesting AI-related sessions at the conference. I tried to focus on real-world examples of how higher ed professionals are thinking through and applying AI and ML, with all their limitations, in day-to-day academic life. Keep Your Data Close and Your AI Closer: Local AI in the Academy In their presentation, Gerol Petruzella &amp; Trevor Murphy (Williams College) presented 3 case studies that leveraged AI to solve real scenarios at their institution. These were 3 privacy-first, cloud-free applications of local AI to support pedagogy, accessibility, and student experimentation.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.ombulabs.ai/blog/nercomp-pt-1.jpg" /><media:content medium="image" url="https://www.ombulabs.ai/blog/nercomp-pt-1.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">AI Agents: Implementing the ReAct Pattern in Ruby</title><link href="https://www.ombulabs.ai/blog/react-agent.html" rel="alternate" type="text/html" title="AI Agents: Implementing the ReAct Pattern in Ruby" /><published>2025-07-28T13:51:23-04:00</published><updated>2025-07-28T13:51:23-04:00</updated><id>https://www.ombulabs.ai/blog/react-agent</id><content type="html" xml:base="https://www.ombulabs.ai/blog/react-agent.html"><![CDATA[<p>AI Agents are everywhere. Every day, tools, libraries, new use cases, and new products come out using and leveraging AI Agents. Several frameworks have been developed to make them easy to build, but what happens under the hood?</p>

<p>A very popular pattern for building AI Agents is the ReAct pattern, short for Reasoning and Acting. The idea is to get large language models (LLMs) to reason about a problem in a manner analogous to how humans do: by breaking down the problem into smaller steps, reasoning about each step, using tools, and then acting on the results.</p>

<p>Let’s walk through the ReAct pattern and how we can use it to build a simple AI Agent that writes blog posts in Ruby.</p>

<!--more-->

<h2 id="the-react-pattern-explained">The ReAct Pattern Explained</h2>

<p>The ReAct pattern was first introduced by Yao et al. in their paper <a href="https://arxiv.org/abs/2210.03629">“ReAct: Synergizing Reasoning and Acting in Language Models”</a>. The key idea is to combine reasoning (thinking) and acting (doing) in a way that allows LLMs to solve complex tasks more effectively.</p>

<p>In standard, one-step answer generation, the user submits a query, the LLM is prompted to generate an answer, and the answer is returned directly to the user.</p>

<p><img src="/blog/assets/images/standard-generation.png" alt="Standard Generation Flow" /></p>

<p>This approach works well for simple questions, but struggles as the tasks become more complex, or when external resources are needed to further inform an answer.</p>

<p>Techniques to tackle reasoning and acting separately exist - chain-of-thought prompting and function calling, for example - but the combination of both steps into a single loop is what makes ReAct powerful.</p>

<p>ReAct combines both reasoning and acting, allowing the model to reason about the results of its actions and use that reasoning to inform its next steps. This is done by breaking down the task into smaller steps, reasoning about each step, using tools to interact with external systems and perform actions, and then reasoning about the results of those actions.</p>

<p><img src="/blog/assets/images/react-pattern.png" alt="ReAct Pattern Flow" /></p>

<p>In essence, the agent operates in a loop of:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>THOUGHT: Reason about the task and decide on the next action
ACTION: Call a tool or perform an action based on the reasoning
OBSERVATION: Receive the result of the action
</code></pre></div></div>

<p>Combining reasoning and acting allows for more complex tasks to be solved effectively.</p>
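
<p>For example, a single iteration for a blog post about hammerhead sharks might look like this (an illustrative trace, not actual model output):</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>THOUGHT: I need background facts about hammerhead sharks before I can write the post.
ACTION: web_search("hammerhead shark habitat, diet, and behavior")
OBSERVATION: [search results about hammerhead species, habitats, and feeding habits]
</code></pre></div></div>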

<h2 id="implementing-react-in-ruby">Implementing ReAct in Ruby</h2>

<p>To implement this, we’ll use the <a href="https://github.com/anthropics/anthropic-sdk-ruby">Anthropic Ruby SDK</a>, <a href="https://www.tavily.com/">Tavily</a> for web search, and <a href="https://serper.dev/">Serper.dev</a> for image search.
You’ll need to set up API keys for all three of them. While Anthropic requires a $5 credit purchase to get started, Tavily and Serper.dev have free tiers.</p>

<h3 id="setting-up">Setting Up</h3>

<p>Make sure you have the required gems in your <code class="language-plaintext highlighter-rouge">Gemfile</code>:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">source</span> <span class="s2">"https://rubygems.org"</span>

<span class="n">gem</span> <span class="s2">"anthropic"</span><span class="p">,</span> <span class="s2">"~&gt; 1.1.1"</span>
<span class="n">gem</span> <span class="s2">"dotenv"</span>
<span class="n">gem</span> <span class="s2">"faraday"</span>
</code></pre></div></div>

<p>You will also need all three API keys in your <code class="language-plaintext highlighter-rouge">.env</code> file:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ANTHROPIC_API_KEY=your_anthropic_api_key
SERPER_API_KEY=your_serper_api_key
TAVILY_API_KEY=your_tavily_api_key
</code></pre></div></div>

<h3 id="preparing-the-tools">Preparing the Tools</h3>

<p>Our agent will use three tools: web search, image search, and saving content to a file. Let’s define the web search tool:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># tools.rb</span>
<span class="nb">require</span> <span class="s2">"faraday"</span>

<span class="k">class</span> <span class="nc">Tools</span>
  <span class="k">def</span> <span class="nf">web_search</span><span class="p">(</span><span class="n">query</span><span class="p">)</span>
    <span class="n">response</span> <span class="o">=</span> <span class="n">tavily_client</span><span class="p">.</span><span class="nf">post</span><span class="p">(</span><span class="s2">"/search"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">req</span><span class="o">|</span>
      <span class="n">req</span><span class="p">.</span><span class="nf">body</span> <span class="o">=</span> <span class="p">{</span>
        <span class="ss">query: </span><span class="n">query</span><span class="p">,</span>
        <span class="ss">search_depth: </span><span class="s2">"advanced"</span>
      <span class="p">}</span>
    <span class="k">end</span>
    <span class="n">parsed</span> <span class="o">=</span> <span class="no">JSON</span><span class="p">.</span><span class="nf">parse</span><span class="p">(</span><span class="n">response</span><span class="p">.</span><span class="nf">body</span><span class="p">)</span>
    <span class="n">parsed</span><span class="p">[</span><span class="s2">"results"</span><span class="p">]</span> <span class="o">||</span> <span class="s2">"No relevant results found."</span>
  <span class="k">rescue</span> <span class="no">Faraday</span><span class="o">::</span><span class="no">Error</span> <span class="o">=&gt;</span> <span class="n">e</span>
    <span class="s2">"Search failed: </span><span class="si">#{</span><span class="n">e</span><span class="p">.</span><span class="nf">message</span><span class="si">}</span><span class="s2">"</span>
  <span class="k">end</span>

  <span class="kp">private</span>

  <span class="k">def</span> <span class="nf">tavily_client</span>
    <span class="vi">@tavily_client</span> <span class="o">||=</span> <span class="no">Faraday</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="ss">url: </span><span class="s2">"https://api.tavily.com"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">conn</span><span class="o">|</span>
      <span class="n">conn</span><span class="p">.</span><span class="nf">request</span> <span class="ss">:json</span>
      <span class="n">conn</span><span class="p">.</span><span class="nf">headers</span><span class="p">[</span><span class="s2">"Authorization"</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"Bearer </span><span class="si">#{</span><span class="no">ENV</span><span class="p">.</span><span class="nf">fetch</span><span class="p">(</span><span class="s2">"TAVILY_API_KEY"</span><span class="p">)</span><span class="si">}</span><span class="s2">"</span>
      <span class="n">conn</span><span class="p">.</span><span class="nf">headers</span><span class="p">[</span><span class="s2">"Content-Type"</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"application/json"</span>
    <span class="k">end</span>
  <span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>

<p>This function uses the Tavily API to perform a web search based on the query provided, returning the results or an error message if the search fails.
Next, let’s define the image search tool using Serper.dev:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># tools.rb</span>
<span class="nb">require</span> <span class="s2">"faraday"</span>

<span class="k">class</span> <span class="nc">Tools</span>
  <span class="k">def</span> <span class="nf">image_search</span><span class="p">(</span><span class="n">query</span><span class="p">)</span>
    <span class="n">response</span> <span class="o">=</span> <span class="n">serper_client</span><span class="p">.</span><span class="nf">post</span><span class="p">(</span><span class="s2">"/images"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">req</span><span class="o">|</span>
      <span class="n">req</span><span class="p">.</span><span class="nf">body</span> <span class="o">=</span> <span class="p">{</span> <span class="ss">q: </span><span class="n">query</span> <span class="p">}</span>
    <span class="k">end</span>
    <span class="n">parsed</span> <span class="o">=</span> <span class="no">JSON</span><span class="p">.</span><span class="nf">parse</span><span class="p">(</span><span class="n">response</span><span class="p">.</span><span class="nf">body</span><span class="p">)</span>
    <span class="n">first_image</span> <span class="o">=</span> <span class="n">parsed</span><span class="p">[</span><span class="s2">"images"</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span>
    <span class="p">{</span>
      <span class="ss">title: </span><span class="n">first_image</span><span class="p">[</span><span class="s2">"title"</span><span class="p">],</span>
      <span class="ss">url: </span><span class="n">first_image</span><span class="p">[</span><span class="s2">"imageUrl"</span><span class="p">]</span>
    <span class="p">}</span>
  <span class="k">rescue</span> <span class="no">Faraday</span><span class="o">::</span><span class="no">Error</span> <span class="o">=&gt;</span> <span class="n">e</span>
    <span class="s2">"Image search failed: </span><span class="si">#{</span><span class="n">e</span><span class="p">.</span><span class="nf">message</span><span class="si">}</span><span class="s2">"</span>
  <span class="k">end</span>

  <span class="k">def</span> <span class="nf">web_search</span><span class="p">(</span><span class="n">query</span><span class="p">)</span>
    <span class="n">response</span> <span class="o">=</span> <span class="n">tavily_client</span><span class="p">.</span><span class="nf">post</span><span class="p">(</span><span class="s2">"/search"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">req</span><span class="o">|</span>
      <span class="n">req</span><span class="p">.</span><span class="nf">body</span> <span class="o">=</span> <span class="p">{</span>
        <span class="ss">query: </span><span class="n">query</span><span class="p">,</span>
        <span class="ss">search_depth: </span><span class="s2">"advanced"</span>
      <span class="p">}</span>
    <span class="k">end</span>
    <span class="n">parsed</span> <span class="o">=</span> <span class="no">JSON</span><span class="p">.</span><span class="nf">parse</span><span class="p">(</span><span class="n">response</span><span class="p">.</span><span class="nf">body</span><span class="p">)</span>
    <span class="n">parsed</span><span class="p">[</span><span class="s2">"results"</span><span class="p">]</span> <span class="o">||</span> <span class="s2">"No relevant results found."</span>
  <span class="k">rescue</span> <span class="no">Faraday</span><span class="o">::</span><span class="no">Error</span> <span class="o">=&gt;</span> <span class="n">e</span>
    <span class="s2">"Search failed: </span><span class="si">#{</span><span class="n">e</span><span class="p">.</span><span class="nf">message</span><span class="si">}</span><span class="s2">"</span>
  <span class="k">end</span>

  <span class="kp">private</span>

  <span class="k">def</span> <span class="nf">serper_client</span>
    <span class="vi">@serper_client</span> <span class="o">||=</span> <span class="no">Faraday</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="ss">url: </span><span class="s2">"https://google.serper.dev"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">conn</span><span class="o">|</span>
      <span class="n">conn</span><span class="p">.</span><span class="nf">request</span> <span class="ss">:json</span>
      <span class="n">conn</span><span class="p">.</span><span class="nf">headers</span><span class="p">[</span><span class="s2">"X-API-KEY"</span><span class="p">]</span> <span class="o">=</span> <span class="no">ENV</span><span class="p">.</span><span class="nf">fetch</span><span class="p">(</span><span class="s2">"SERPER_API_KEY"</span><span class="p">)</span>
      <span class="n">conn</span><span class="p">.</span><span class="nf">headers</span><span class="p">[</span><span class="s2">"Content-Type"</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"application/json"</span>
    <span class="k">end</span>
  <span class="k">end</span>

  <span class="k">def</span> <span class="nf">tavily_client</span>
    <span class="vi">@tavily_client</span> <span class="o">||=</span> <span class="no">Faraday</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="ss">url: </span><span class="s2">"https://api.tavily.com"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">conn</span><span class="o">|</span>
      <span class="n">conn</span><span class="p">.</span><span class="nf">request</span> <span class="ss">:json</span>
      <span class="n">conn</span><span class="p">.</span><span class="nf">headers</span><span class="p">[</span><span class="s2">"Authorization"</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"Bearer </span><span class="si">#{</span><span class="no">ENV</span><span class="p">.</span><span class="nf">fetch</span><span class="p">(</span><span class="s2">"TAVILY_API_KEY"</span><span class="p">)</span><span class="si">}</span><span class="s2">"</span>
      <span class="n">conn</span><span class="p">.</span><span class="nf">headers</span><span class="p">[</span><span class="s2">"Content-Type"</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"application/json"</span>
    <span class="k">end</span>
  <span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>

<p>Now we have a way to search for images based on a provided query.</p>

<p>Finally, let’s define the tool to save content to a file:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># tools.rb</span>
<span class="nb">require</span> <span class="s2">"faraday"</span>

<span class="k">class</span> <span class="nc">Tools</span>
  <span class="k">def</span> <span class="nf">save_to_file</span><span class="p">(</span><span class="n">content</span><span class="p">,</span> <span class="n">filename</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">content</span><span class="p">.</span><span class="nf">nil?</span> <span class="o">||</span> <span class="n">content</span><span class="p">.</span><span class="nf">strip</span><span class="p">.</span><span class="nf">length</span> <span class="o">&lt;</span> <span class="mi">20</span>
      <span class="k">return</span> <span class="s2">"Save failed: content is too short or empty."</span>
    <span class="k">end</span>

    <span class="no">File</span><span class="p">.</span><span class="nf">open</span><span class="p">(</span><span class="n">filename</span><span class="p">,</span> <span class="s2">"w"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">file</span><span class="o">|</span>
      <span class="n">file</span><span class="p">.</span><span class="nf">write</span><span class="p">(</span><span class="n">content</span><span class="p">)</span>
    <span class="k">end</span>
    <span class="s2">"Content saved to </span><span class="si">#{</span><span class="n">filename</span><span class="si">}</span><span class="s2">"</span>
  <span class="k">rescue</span> <span class="o">=&gt;</span> <span class="n">e</span>
    <span class="s2">"Failed to save content: </span><span class="si">#{</span><span class="n">e</span><span class="p">.</span><span class="nf">message</span><span class="si">}</span><span class="s2">"</span>
  <span class="k">end</span>

  <span class="k">def</span> <span class="nf">image_search</span><span class="p">(</span><span class="n">query</span><span class="p">)</span>
    <span class="n">response</span> <span class="o">=</span> <span class="n">serper_client</span><span class="p">.</span><span class="nf">post</span><span class="p">(</span><span class="s2">"/images"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">req</span><span class="o">|</span>
      <span class="n">req</span><span class="p">.</span><span class="nf">body</span> <span class="o">=</span> <span class="p">{</span> <span class="ss">q: </span><span class="n">query</span> <span class="p">}</span>
    <span class="k">end</span>
    <span class="n">parsed</span> <span class="o">=</span> <span class="no">JSON</span><span class="p">.</span><span class="nf">parse</span><span class="p">(</span><span class="n">response</span><span class="p">.</span><span class="nf">body</span><span class="p">)</span>
    <span class="n">first_image</span> <span class="o">=</span> <span class="n">parsed</span><span class="p">[</span><span class="s2">"images"</span><span class="p">][</span><span class="mi">0</span><span class="p">]</span>
    <span class="p">{</span> <span class="ss">title: </span><span class="n">first_image</span><span class="p">[</span><span class="s2">"title"</span><span class="p">],</span> <span class="ss">url: </span><span class="n">first_image</span><span class="p">[</span><span class="s2">"imageUrl"</span><span class="p">]</span> <span class="p">}</span>
  <span class="k">rescue</span> <span class="no">Faraday</span><span class="o">::</span><span class="no">Error</span> <span class="o">=&gt;</span> <span class="n">e</span>
    <span class="s2">"Image search failed: </span><span class="si">#{</span><span class="n">e</span><span class="p">.</span><span class="nf">message</span><span class="si">}</span><span class="s2">"</span>
  <span class="k">end</span>

  <span class="k">def</span> <span class="nf">web_search</span><span class="p">(</span><span class="n">query</span><span class="p">)</span>
    <span class="n">response</span> <span class="o">=</span> <span class="n">tavily_client</span><span class="p">.</span><span class="nf">post</span><span class="p">(</span><span class="s2">"/search"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">req</span><span class="o">|</span>
      <span class="n">req</span><span class="p">.</span><span class="nf">body</span> <span class="o">=</span> <span class="p">{</span>
        <span class="ss">query: </span><span class="n">query</span><span class="p">,</span>
        <span class="ss">search_depth: </span><span class="s2">"advanced"</span>
      <span class="p">}</span>
    <span class="k">end</span>
    <span class="n">parsed</span> <span class="o">=</span> <span class="no">JSON</span><span class="p">.</span><span class="nf">parse</span><span class="p">(</span><span class="n">response</span><span class="p">.</span><span class="nf">body</span><span class="p">)</span>
    <span class="n">parsed</span><span class="p">[</span><span class="s2">"results"</span><span class="p">]</span> <span class="o">||</span> <span class="s2">"No relevant results found."</span>
  <span class="k">rescue</span> <span class="no">Faraday</span><span class="o">::</span><span class="no">Error</span> <span class="o">=&gt;</span> <span class="n">e</span>
    <span class="s2">"Search failed: </span><span class="si">#{</span><span class="n">e</span><span class="p">.</span><span class="nf">message</span><span class="si">}</span><span class="s2">"</span>
  <span class="k">end</span>

  <span class="kp">private</span>

  <span class="k">def</span> <span class="nf">serper_client</span>
    <span class="vi">@serper_client</span> <span class="o">||=</span> <span class="no">Faraday</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="ss">url: </span><span class="s2">"https://google.serper.dev"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">conn</span><span class="o">|</span>
      <span class="n">conn</span><span class="p">.</span><span class="nf">request</span> <span class="ss">:json</span>
      <span class="n">conn</span><span class="p">.</span><span class="nf">headers</span><span class="p">[</span><span class="s2">"X-API-KEY"</span><span class="p">]</span> <span class="o">=</span> <span class="no">ENV</span><span class="p">.</span><span class="nf">fetch</span><span class="p">(</span><span class="s2">"SERPER_API_KEY"</span><span class="p">)</span>
      <span class="n">conn</span><span class="p">.</span><span class="nf">headers</span><span class="p">[</span><span class="s2">"Content-Type"</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"application/json"</span>
    <span class="k">end</span>
  <span class="k">end</span>

  <span class="k">def</span> <span class="nf">tavily_client</span>
    <span class="vi">@tavily_client</span> <span class="o">||=</span> <span class="no">Faraday</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="ss">url: </span><span class="s2">"https://api.tavily.com"</span><span class="p">)</span> <span class="k">do</span> <span class="o">|</span><span class="n">conn</span><span class="o">|</span>
      <span class="n">conn</span><span class="p">.</span><span class="nf">request</span> <span class="ss">:json</span>
      <span class="n">conn</span><span class="p">.</span><span class="nf">headers</span><span class="p">[</span><span class="s2">"Authorization"</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"Bearer </span><span class="si">#{</span><span class="no">ENV</span><span class="p">.</span><span class="nf">fetch</span><span class="p">(</span><span class="s2">"TAVILY_API_KEY"</span><span class="p">)</span><span class="si">}</span><span class="s2">"</span>
      <span class="n">conn</span><span class="p">.</span><span class="nf">headers</span><span class="p">[</span><span class="s2">"Content-Type"</span><span class="p">]</span> <span class="o">=</span> <span class="s2">"application/json"</span>
    <span class="k">end</span>
  <span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>

<p>And now we have all of our tools defined. The last step in setting up our tools for the agent is to create function definitions for them.
To let the model decide which tool to use, we’ll leverage function calling, which allows the LLM to request tool invocations. You can read more about it in the <a href="https://docs.anthropic.com/en/docs/agents-and-tools/tool-use/overview">Anthropic Tool use with Claude documentation</a>.</p>

<p>Anthropic expects the function definitions to be in a specific JSON format, so Claude can understand which tools are available and what each tool does:</p>

<div class="language-json highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">{</span><span class="w">
  </span><span class="nl">"name"</span><span class="p">:</span><span class="w"> </span><span class="s2">""</span><span class="p">,</span><span class="w">
  </span><span class="nl">"description"</span><span class="p">:</span><span class="w"> </span><span class="s2">""</span><span class="p">,</span><span class="w">
  </span><span class="nl">"input_schema"</span><span class="p">:</span><span class="w"> </span><span class="p">{</span><span class="w">
    </span><span class="nl">"type"</span><span class="p">:</span><span class="w"> </span><span class="s2">"object"</span><span class="p">,</span><span class="w">
    </span><span class="nl">"properties"</span><span class="p">:</span><span class="w"> </span><span class="p">{},</span><span class="w">
    </span><span class="nl">"required"</span><span class="p">:</span><span class="w"> </span><span class="p">[]</span><span class="w">
  </span><span class="p">}</span><span class="w">
</span><span class="p">}</span><span class="w">
</span></code></pre></div></div>

<p>Here’s how we can define our tools:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="no">FUNCTION_DEFINITIONS</span> <span class="o">=</span> <span class="p">[</span>
  <span class="p">{</span>
    <span class="ss">name: </span><span class="s2">"web_search"</span><span class="p">,</span>
    <span class="ss">description: </span><span class="s2">"Search the web for a topic"</span><span class="p">,</span>
    <span class="ss">input_schema: </span><span class="p">{</span>
      <span class="ss">type: </span><span class="s2">"object"</span><span class="p">,</span>
      <span class="ss">properties: </span><span class="p">{</span>
        <span class="ss">query: </span><span class="p">{</span> <span class="ss">type: </span><span class="s2">"string"</span><span class="p">,</span> <span class="ss">description: </span><span class="s2">"The topic to search for"</span> <span class="p">}</span>
      <span class="p">},</span>
      <span class="ss">required: </span><span class="p">[</span><span class="s2">"query"</span><span class="p">]</span>
    <span class="p">}</span>
  <span class="p">},</span>
  <span class="p">{</span>
    <span class="ss">name: </span><span class="s2">"image_search"</span><span class="p">,</span>
    <span class="ss">description: </span><span class="s2">"Find an image related to the topic"</span><span class="p">,</span>
    <span class="ss">input_schema: </span><span class="p">{</span>
      <span class="ss">type: </span><span class="s2">"object"</span><span class="p">,</span>
      <span class="ss">properties: </span><span class="p">{</span>
        <span class="ss">query: </span><span class="p">{</span> <span class="ss">type: </span><span class="s2">"string"</span><span class="p">,</span> <span class="ss">description: </span><span class="s2">"The image search term"</span> <span class="p">}</span>
      <span class="p">},</span>
      <span class="ss">required: </span><span class="p">[</span><span class="s2">"query"</span><span class="p">]</span>
    <span class="p">}</span>
  <span class="p">},</span>
  <span class="p">{</span>
    <span class="ss">name: </span><span class="s2">"save_to_file"</span><span class="p">,</span>
    <span class="ss">description: </span><span class="s2">"Save the markdown blog post to a file"</span><span class="p">,</span>
    <span class="ss">input_schema: </span><span class="p">{</span>
      <span class="ss">type: </span><span class="s2">"object"</span><span class="p">,</span>
      <span class="ss">properties: </span><span class="p">{</span>
        <span class="ss">content: </span><span class="p">{</span> <span class="ss">type: </span><span class="s2">"string"</span><span class="p">,</span> <span class="ss">description: </span><span class="s2">"Markdown content of the blog post"</span> <span class="p">},</span>
        <span class="ss">filename: </span><span class="p">{</span> <span class="ss">type: </span><span class="s2">"string"</span><span class="p">,</span> <span class="ss">description: </span><span class="s2">"Markdown file name to save as (e.g., 'my-post.md')"</span> <span class="p">}</span>
      <span class="p">},</span>
      <span class="ss">required: </span><span class="sx">%w[content filename]</span>
    <span class="p">}</span>
  <span class="p">}</span>
<span class="p">]</span>
</code></pre></div></div>

<p>Now our tool setup is ready. We can use these definitions to call the tools from our agent.</p>

<h3 id="building-the-agent">Building the Agent</h3>

<p>Now that we have our tools defined, we can build the agent that will use them to write a blog post.</p>

<p>The agent will be triggered by an incoming query, which will be the topic of the blog post. We want to define a system prompt to guide the agent’s behavior. The system prompt will instruct the agent to reason about the topic, search for relevant information, find an image, and then write a markdown blog post.</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="no">SYSTEM_PROMPT</span> <span class="o">=</span> <span class="o">&lt;&lt;~</span><span class="no">PROMPT</span><span class="sh">
  You are an agent that thinks step by step and uses tools to complete your task.

  Your task is to write a blog post on the given topic.

  The blog post must:
  - Start with a markdown H1 title (e.g. `# Hammer Head Sharks`)
  - Include a markdown image below the title (`![alt](url)`)
  - Include a few paragraphs of well-formatted markdown content

  Available tools:
  - web_search: to look up information on the topic and gather relevant content
  - image_search: to find an image related to the topic
  - save_to_file: to save the final blog post to a file

  You should:
  1. Use `web_search` to gather information about the topic (iterate until you have enough content)
  2. Use `image_search` to find a relevant image
  3. Format the content into a markdown blog post, with:
    - An H1 title
    - An image below the title, added as an HTML &lt;img&gt; tag with the height set to no more than 300
    - A few paragraphs of content
  4. Save the final post using `save_to_file`

  IMPORTANT: Do not call `save_to_file` until:
  - You have included both an image and a title
  - The content is complete
  - You are fully ready to save

  After saving, you may return a final message to the user confirming the post was saved.
</span><span class="no">PROMPT</span>
</code></pre></div></div>

<p>Our prompt is structured to provide the agent with information on the task to accomplish, the requirements for the result, the available tools, and what it should do.</p>

<p>Now let’s build our ReAct agent. First, we’ll create a class to encapsulate the agent and initialize the Anthropic client:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># agent.rb</span>
<span class="nb">require</span> <span class="s2">"dotenv/load"</span>
<span class="nb">require</span> <span class="s2">"anthropic"</span>

<span class="nb">require_relative</span> <span class="s2">"tools"</span>  <span class="c1"># Assuming the Tools class is defined in tools.rb</span>

<span class="k">class</span> <span class="nc">ReActAgent</span>
  <span class="k">def</span> <span class="nf">initialize</span>
    <span class="vi">@tools</span> <span class="o">=</span> <span class="no">Tools</span><span class="p">.</span><span class="nf">new</span>
    <span class="vi">@messages</span> <span class="o">=</span> <span class="p">[]</span>
  <span class="k">end</span>

  <span class="kp">private</span>

  <span class="k">def</span> <span class="nf">complete</span>
    <span class="n">client</span><span class="p">.</span><span class="nf">messages</span><span class="p">.</span><span class="nf">create</span><span class="p">(</span>
      <span class="ss">model: </span><span class="s2">"claude-sonnet-4-20250514"</span><span class="p">,</span>
      <span class="ss">max_tokens: </span><span class="mi">1024</span><span class="p">,</span>
      <span class="ss">temperature: </span><span class="mf">0.0</span><span class="p">,</span>
      <span class="ss">system: </span><span class="no">SYSTEM_PROMPT</span><span class="p">,</span>
      <span class="ss">messages: </span><span class="vi">@messages</span><span class="p">,</span>
      <span class="ss">tools: </span><span class="no">FUNCTION_DEFINITIONS</span>
    <span class="p">)</span>
  <span class="k">end</span>

  <span class="k">def</span> <span class="nf">client</span>
    <span class="vi">@client</span> <span class="o">||=</span> <span class="no">Anthropic</span><span class="o">::</span><span class="no">Client</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="ss">api_key: </span><span class="no">ENV</span><span class="p">.</span><span class="nf">fetch</span><span class="p">(</span><span class="s2">"ANTHROPIC_API_KEY"</span><span class="p">))</span>
  <span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>

<p>The <code class="language-plaintext highlighter-rouge">@messages</code> array will hold the conversation history, which we will use to keep track of the agent’s reasoning and actions.
In the <code class="language-plaintext highlighter-rouge">complete</code> method, we also specify which model to use, the temperature, the system prompt, and the tools available to the agent.
In this example, the temperature is set to 0 so the model will be as deterministic as possible, meaning it will try to always return the same output for the same input. This keeps our test runs consistent.
A higher temperature would encourage the model to be more “creative”, so you might want to experiment with that and see what works best for your use case.</p>

<p>Let’s also define a method to call the right tool when the agent decides to use one:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">tool_call</span><span class="p">(</span><span class="n">tool_name</span><span class="p">,</span> <span class="n">params</span><span class="p">)</span>
  <span class="k">case</span> <span class="n">tool_name</span>
  <span class="k">when</span> <span class="s2">"web_search"</span>
    <span class="vi">@tools</span><span class="p">.</span><span class="nf">web_search</span><span class="p">(</span><span class="n">params</span><span class="p">[</span><span class="ss">:query</span><span class="p">])</span>
  <span class="k">when</span> <span class="s2">"image_search"</span>
    <span class="vi">@tools</span><span class="p">.</span><span class="nf">image_search</span><span class="p">(</span><span class="n">params</span><span class="p">[</span><span class="ss">:query</span><span class="p">])</span>
  <span class="k">when</span> <span class="s2">"save_to_file"</span>
    <span class="vi">@tools</span><span class="p">.</span><span class="nf">save_to_file</span><span class="p">(</span><span class="n">params</span><span class="p">[</span><span class="ss">:content</span><span class="p">],</span> <span class="n">params</span><span class="p">[</span><span class="ss">:filename</span><span class="p">]</span> <span class="o">||</span> <span class="n">filename</span><span class="p">)</span>
  <span class="k">else</span>
    <span class="s2">"Unknown tool"</span>
  <span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>

<p>Now let’s define our <code class="language-plaintext highlighter-rouge">run</code> method, which will be responsible for processing the user’s query and generating the blog post.</p>

<p>First, we need to add the user’s query to the conversation history:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">run</span><span class="p">(</span><span class="n">query</span><span class="p">)</span>
  <span class="vi">@messages</span> <span class="o">&lt;&lt;</span> <span class="p">{</span>
    <span class="ss">role: </span><span class="s2">"user"</span><span class="p">,</span>
    <span class="ss">content: </span><span class="p">[</span>
      <span class="p">{</span> <span class="ss">type: </span><span class="s2">"text"</span><span class="p">,</span> <span class="ss">text: </span><span class="s2">"Write a markdown blog post on: </span><span class="si">#{</span><span class="n">query</span><span class="si">}</span><span class="s2">"</span> <span class="p">}</span>
    <span class="p">]</span>
  <span class="p">}</span>
<span class="k">end</span>
</code></pre></div></div>

<p>Then we’ll start the agent loop. A ReAct agent will:</p>

<ul>
  <li>Generate a response from the LLM</li>
  <li>Check if the response contains a tool call</li>
  <li>If it does, call the tool and add the result to the conversation history</li>
  <li>Repeat until the agent has completed the task</li>
</ul>

<p>Anthropic returns a response whose content is an array of blocks (text or tool use). Once we get a response that contains only text blocks and no tool calls, we’ll break the loop.</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">run</span><span class="p">(</span><span class="n">query</span><span class="p">)</span>
  <span class="vi">@messages</span> <span class="o">&lt;&lt;</span> <span class="p">{</span>
    <span class="ss">role: </span><span class="s2">"user"</span><span class="p">,</span>
    <span class="ss">content: </span><span class="p">[</span>
      <span class="p">{</span> <span class="ss">type: </span><span class="s2">"text"</span><span class="p">,</span> <span class="ss">text: </span><span class="s2">"Write a markdown blog post on: </span><span class="si">#{</span><span class="n">query</span><span class="si">}</span><span class="s2">"</span> <span class="p">}</span>
    <span class="p">]</span>
  <span class="p">}</span>

  <span class="kp">loop</span> <span class="k">do</span>
    <span class="c1"># Prompt the model to generate a response based on the conversation history</span>
    <span class="n">response</span> <span class="o">=</span> <span class="n">complete</span>

    <span class="c1"># Check if the response contains only text blocks. If so, we can assume the agent has finished its task.</span>
    <span class="k">if</span> <span class="n">response</span><span class="p">.</span><span class="nf">content</span><span class="p">.</span><span class="nf">all?</span> <span class="p">{</span> <span class="o">|</span><span class="n">block</span><span class="o">|</span> <span class="n">block</span><span class="p">[</span><span class="ss">:type</span><span class="p">]</span> <span class="o">==</span> <span class="ss">:text</span> <span class="p">}</span>
      <span class="nb">puts</span> <span class="s2">"</span><span class="se">\n</span><span class="s2">✅ Claude has finished. Exiting."</span>
      <span class="k">break</span>
    <span class="k">end</span>

    <span class="c1"># The "thought" in the reason step is the text block in the response. We'll add it to the conversation history.</span>
    <span class="nb">puts</span> <span class="s2">"</span><span class="se">\n</span><span class="s2">💬 Thought: </span><span class="si">#{</span><span class="n">response</span><span class="p">.</span><span class="nf">content</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="ss">:text</span><span class="p">]</span><span class="si">}</span><span class="s2">"</span>

    <span class="vi">@messages</span> <span class="o">&lt;&lt;</span> <span class="p">{</span>
      <span class="ss">role: </span><span class="s2">"assistant"</span><span class="p">,</span>
      <span class="ss">content: </span><span class="n">response</span><span class="p">.</span><span class="nf">content</span>
    <span class="p">}</span>

    <span class="n">response</span><span class="p">.</span><span class="nf">content</span><span class="p">.</span><span class="nf">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">block</span><span class="o">|</span>
      <span class="c1"># Process tool calls in the response block</span>
      <span class="k">next</span> <span class="k">unless</span> <span class="n">block</span><span class="p">[</span><span class="ss">:type</span><span class="p">]</span> <span class="o">==</span> <span class="ss">:tool_use</span>

      <span class="n">tool_name</span> <span class="o">=</span> <span class="n">block</span><span class="p">[</span><span class="ss">:name</span><span class="p">]</span>
      <span class="n">params</span> <span class="o">=</span> <span class="n">block</span><span class="p">[</span><span class="ss">:input</span><span class="p">]</span>

      <span class="c1"># The "action" in the act step is the tool call with the necessary parameters.</span>
      <span class="c1"># Claude provides values for the parameters, so we can call the tool directly.</span>
      <span class="nb">puts</span> <span class="s2">"🔧 Action: </span><span class="si">#{</span><span class="n">tool_name</span><span class="si">}</span><span class="s2">(</span><span class="si">#{</span><span class="n">params</span><span class="si">}</span><span class="s2">)"</span>

      <span class="n">result</span> <span class="o">=</span> <span class="n">tool_call</span><span class="p">(</span><span class="n">tool_name</span><span class="p">,</span> <span class="n">params</span><span class="p">)</span>

      <span class="c1"># The result of the tool call is the observation that the agent can use to reason about the next steps.</span>
      <span class="c1"># We'll add the result to the conversation history.</span>
      <span class="nb">puts</span> <span class="s2">"📝 Observation: </span><span class="si">#{</span><span class="n">result</span><span class="p">.</span><span class="nf">to_s</span><span class="p">[</span><span class="mi">0</span><span class="o">..</span><span class="mi">120</span><span class="p">]</span><span class="si">}</span><span class="s2">..."</span>

      <span class="vi">@messages</span> <span class="o">&lt;&lt;</span> <span class="p">{</span>
        <span class="ss">role: </span><span class="s2">"user"</span><span class="p">,</span>
        <span class="ss">content: </span><span class="p">[</span>
          <span class="p">{</span>
            <span class="ss">type: </span><span class="s2">"tool_result"</span><span class="p">,</span>
            <span class="ss">tool_use_id: </span><span class="n">block</span><span class="p">[</span><span class="ss">:id</span><span class="p">],</span>
            <span class="ss">content: </span><span class="n">result</span><span class="p">.</span><span class="nf">to_s</span>
          <span class="p">}</span>
        <span class="p">]</span>
      <span class="p">}</span>
    <span class="k">end</span>
  <span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>

<p>Putting it all together, our <code class="language-plaintext highlighter-rouge">ReActAgent</code> class looks like this:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># agent.rb</span>
<span class="nb">require</span> <span class="s2">"dotenv/load"</span>
<span class="nb">require</span> <span class="s2">"anthropic"</span>

<span class="nb">require_relative</span> <span class="s2">"tools"</span>

<span class="k">class</span> <span class="nc">ReActAgent</span>
  <span class="k">def</span> <span class="nf">initialize</span>
    <span class="vi">@tools</span> <span class="o">=</span> <span class="no">Tools</span><span class="p">.</span><span class="nf">new</span>
    <span class="vi">@messages</span> <span class="o">=</span> <span class="p">[]</span>
  <span class="k">end</span>

  <span class="k">def</span> <span class="nf">run</span><span class="p">(</span><span class="n">query</span><span class="p">)</span>
    <span class="vi">@messages</span> <span class="o">&lt;&lt;</span> <span class="p">{</span>
      <span class="ss">role: </span><span class="s2">"user"</span><span class="p">,</span>
      <span class="ss">content: </span><span class="p">[</span>
        <span class="p">{</span> <span class="ss">type: </span><span class="s2">"text"</span><span class="p">,</span> <span class="ss">text: </span><span class="s2">"Write a markdown blog post on: </span><span class="si">#{</span><span class="n">query</span><span class="si">}</span><span class="s2">"</span> <span class="p">}</span>
      <span class="p">]</span>
    <span class="p">}</span>

    <span class="kp">loop</span> <span class="k">do</span>
      <span class="n">response</span> <span class="o">=</span> <span class="n">complete</span>

      <span class="k">if</span> <span class="n">response</span><span class="p">.</span><span class="nf">content</span><span class="p">.</span><span class="nf">all?</span> <span class="p">{</span> <span class="o">|</span><span class="n">block</span><span class="o">|</span> <span class="n">block</span><span class="p">[</span><span class="ss">:type</span><span class="p">]</span> <span class="o">==</span> <span class="ss">:text</span> <span class="p">}</span>
        <span class="nb">puts</span> <span class="s2">"</span><span class="se">\n</span><span class="s2">✅ Claude has finished. Exiting."</span>
        <span class="k">break</span>
      <span class="k">end</span>

      <span class="nb">puts</span> <span class="s2">"</span><span class="se">\n</span><span class="s2">💬 Thought: </span><span class="si">#{</span><span class="n">response</span><span class="p">.</span><span class="nf">content</span><span class="p">[</span><span class="mi">0</span><span class="p">][</span><span class="ss">:text</span><span class="p">]</span><span class="si">}</span><span class="s2">"</span>

      <span class="vi">@messages</span> <span class="o">&lt;&lt;</span> <span class="p">{</span>
        <span class="ss">role: </span><span class="s2">"assistant"</span><span class="p">,</span>
        <span class="ss">content: </span><span class="n">response</span><span class="p">.</span><span class="nf">content</span>
      <span class="p">}</span>

      <span class="n">response</span><span class="p">.</span><span class="nf">content</span><span class="p">.</span><span class="nf">each</span> <span class="k">do</span> <span class="o">|</span><span class="n">block</span><span class="o">|</span>
        <span class="k">next</span> <span class="k">unless</span> <span class="n">block</span><span class="p">[</span><span class="ss">:type</span><span class="p">]</span> <span class="o">==</span> <span class="ss">:tool_use</span>

        <span class="n">tool_name</span> <span class="o">=</span> <span class="n">block</span><span class="p">[</span><span class="ss">:name</span><span class="p">]</span>
        <span class="n">params</span> <span class="o">=</span> <span class="n">block</span><span class="p">[</span><span class="ss">:input</span><span class="p">]</span>

        <span class="nb">puts</span> <span class="s2">"🔧 Action: </span><span class="si">#{</span><span class="n">tool_name</span><span class="si">}</span><span class="s2">(</span><span class="si">#{</span><span class="n">params</span><span class="si">}</span><span class="s2">)"</span>

        <span class="n">result</span> <span class="o">=</span> <span class="n">tool_call</span><span class="p">(</span><span class="n">tool_name</span><span class="p">,</span> <span class="n">params</span><span class="p">)</span>

        <span class="nb">puts</span> <span class="s2">"📝 Observation: </span><span class="si">#{</span><span class="n">result</span><span class="p">.</span><span class="nf">to_s</span><span class="p">[</span><span class="mi">0</span><span class="o">..</span><span class="mi">120</span><span class="p">]</span><span class="si">}</span><span class="s2">..."</span>

        <span class="vi">@messages</span> <span class="o">&lt;&lt;</span> <span class="p">{</span>
          <span class="ss">role: </span><span class="s2">"user"</span><span class="p">,</span>
          <span class="ss">content: </span><span class="p">[</span>
            <span class="p">{</span>
              <span class="ss">type: </span><span class="s2">"tool_result"</span><span class="p">,</span>
              <span class="ss">tool_use_id: </span><span class="n">block</span><span class="p">[</span><span class="ss">:id</span><span class="p">],</span>
              <span class="ss">content: </span><span class="n">result</span><span class="p">.</span><span class="nf">to_s</span>
            <span class="p">}</span>
          <span class="p">]</span>
        <span class="p">}</span>
      <span class="k">end</span>
    <span class="k">end</span>
  <span class="k">end</span>

  <span class="kp">private</span>

  <span class="k">def</span> <span class="nf">tool_call</span><span class="p">(</span><span class="n">tool_name</span><span class="p">,</span> <span class="n">params</span><span class="p">)</span>
    <span class="k">case</span> <span class="n">tool_name</span>
    <span class="k">when</span> <span class="s2">"web_search"</span>
      <span class="vi">@tools</span><span class="p">.</span><span class="nf">web_search</span><span class="p">(</span><span class="n">params</span><span class="p">[</span><span class="ss">:query</span><span class="p">])</span>
    <span class="k">when</span> <span class="s2">"image_search"</span>
      <span class="vi">@tools</span><span class="p">.</span><span class="nf">image_search</span><span class="p">(</span><span class="n">params</span><span class="p">[</span><span class="ss">:query</span><span class="p">])</span>
    <span class="k">when</span> <span class="s2">"save_to_file"</span>
      <span class="vi">@tools</span><span class="p">.</span><span class="nf">save_to_file</span><span class="p">(</span><span class="n">params</span><span class="p">[</span><span class="ss">:content</span><span class="p">],</span> <span class="n">params</span><span class="p">[</span><span class="ss">:filename</span><span class="p">]</span> <span class="o">||</span> <span class="n">filename</span><span class="p">)</span>
    <span class="k">else</span>
      <span class="s2">"Unknown tool"</span>
    <span class="k">end</span>
  <span class="k">end</span>

  <span class="k">def</span> <span class="nf">complete</span>
    <span class="n">client</span><span class="p">.</span><span class="nf">messages</span><span class="p">.</span><span class="nf">create</span><span class="p">(</span>
      <span class="ss">model: </span><span class="s2">"claude-sonnet-4-20250514"</span><span class="p">,</span>
      <span class="ss">max_tokens: </span><span class="mi">1024</span><span class="p">,</span>
      <span class="ss">temperature: </span><span class="mf">0.0</span><span class="p">,</span>
      <span class="ss">system: </span><span class="no">SYSTEM_PROMPT</span><span class="p">,</span>
      <span class="ss">messages: </span><span class="vi">@messages</span><span class="p">,</span>
      <span class="ss">tools: </span><span class="no">FUNCTION_DEFINITIONS</span>
    <span class="p">)</span>
  <span class="k">end</span>

  <span class="k">def</span> <span class="nf">client</span>
    <span class="vi">@client</span> <span class="o">||=</span> <span class="no">Anthropic</span><span class="o">::</span><span class="no">Client</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span><span class="ss">api_key: </span><span class="no">ENV</span><span class="p">.</span><span class="nf">fetch</span><span class="p">(</span><span class="s2">"ANTHROPIC_API_KEY"</span><span class="p">))</span>
  <span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>

<h3 id="testing-the-agent">Testing the Agent</h3>

<p>To test our agent, we can create a simple script that initializes the agent and runs it with a query:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1"># blog_writer.rb</span>
<span class="nb">require_relative</span> <span class="s2">"agent"</span>  <span class="c1"># Assuming the ReActAgent class is defined in agent.rb</span>

<span class="k">if</span> <span class="kp">__FILE__</span> <span class="o">==</span> <span class="vg">$0</span>
  <span class="n">query</span> <span class="o">=</span> <span class="no">ARGV</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span>
  <span class="n">agent</span> <span class="o">=</span> <span class="no">ReActAgent</span><span class="p">.</span><span class="nf">new</span>
  <span class="n">agent</span><span class="p">.</span><span class="nf">run</span><span class="p">(</span><span class="n">query</span><span class="p">)</span>
<span class="k">end</span>
</code></pre></div></div>

<p>Let’s see what it has to say about hammerhead sharks:</p>

<div class="language-bash highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ruby blog_writer.rb <span class="s2">"Hammer head sharks"</span>
</code></pre></div></div>

<p>As the agent works through the task, we can see its reasoning step by step:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>💬 Thought: I'll help you write a blog post about hammerhead sharks. Let me start by gathering information about them and finding a relevant image.
🔧 Action: web_search({query: "hammerhead sharks facts characteristics behavior habitat"})
📝 Observation: [{"url" =&gt; "https://whi.swimtopia.com/fun-facts-about-hammerhead-sharks", "title" =&gt; "Fun Facts About Hammerhead Sharks!"...

💬 Thought: Now let me search for more specific information about hammerhead sharks to get a comprehensive understanding.
🔧 Action: web_search({query: "hammerhead shark species types great scalloped smooth bonnethead"})
📝 Observation: [{"url" =&gt; "https://en.wikipedia.org/wiki/Hammerhead_shark", "title" =&gt; "Hammerhead shark - Wikipedia", "content" =&gt; "| S...

💬 Thought: Now let me find a relevant image for the blog post.
🔧 Action: image_search({query: "hammerhead shark underwater swimming"})
📝 Observation: {title: "Do sharks hold their breath underwater? This species might ...", url: "https://i.natgeofe.com/n/22b41661-0a35-49...

💬 Thought: Perfect! Now I have enough information to write a comprehensive blog post about hammerhead sharks. Let me create the markdown content and save it to a file.
🔧 Action: save_to_file({filename: "hammerhead-sharks.md"})
📝 Observation: Save failed: content is too short or empty....

💬 Thought: Let me create the complete blog post content first:
🔧 Action: save_to_file({filename: "hammerhead-sharks.md", content: "# Hammerhead Sharks\n\n&lt;img src=\"https://i.natgeofe.com/n/22b41661-0a35-4933-8e8f-d53552a68705/National..."})
📝 Observation: Content saved to hammerhead-sharks.md...

✅ Claude has finished. Exiting.
</code></pre></div></div>

<p>You can see the agent reasoning about the task, searching for information, and acting on each result. Notice the failed tool call near the end: the agent tried to save the file without providing the blog post content. Because it processes the observation returned by every tool call, it recognizes the failure, retries with the content included, and succeeds.</p>

<p>The generated post was saved in a file called <code class="language-plaintext highlighter-rouge">hammerhead-sharks.md</code> in the root directory, and you can see what the resulting blog post looks like below:</p>

<p><img src="/blog/assets/images/react-agent-blog-post.png" alt="ReAct Agent Blog Post Rendered" /></p>

<h2 id="conclusion">Conclusion</h2>

<p>AI Agents are powerful tools, and the ReAct pattern allows us to build agents that can reason about complex tasks and interact with external tools to accomplish them.
By combining reasoning and acting, we can create agents that solve tasks much like humans do: breaking problems into smaller steps and using tools to interact with the world.</p>

<p>Want to know how we can help you leverage AI for your business? <a href="/#contact-us">Talk to us today!</a>.</p>]]></content><author><name>abizzinotto</name></author><category term="ai-agents" /><summary type="html"><![CDATA[AI Agents are everywhere. Every day, tools, libraries, new use cases, and new products come out using and leveraging AI Agents. Several frameworks have been developed to make them easy to build, but what happens under the hood? A very popular pattern for building AI Agents is the ReAct pattern, meaning Reasoning and Acting. The idea is to get large language models (LLMs) to reason about a problem in a manner analogous to how humans do, by breaking down the problem into smaller steps, reasoning about each step, using tools, and then acting on the results. Let’s walk through the ReAct pattern and how we can use it to build a simple AI Agent that writes blog posts in Ruby.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.ombulabs.ai/blog/ai-agents-react-pattern.jpg" /><media:content medium="image" url="https://www.ombulabs.ai/blog/ai-agents-react-pattern.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Implementing Semantic Search with Sequel and pgvector</title><link href="https://www.ombulabs.ai/blog/semantic-search.html" rel="alternate" type="text/html" title="Implementing Semantic Search with Sequel and pgvector" /><published>2025-06-03T16:17:00-04:00</published><updated>2025-06-03T16:17:00-04:00</updated><id>https://www.ombulabs.ai/blog/semantic-search</id><content type="html" xml:base="https://www.ombulabs.ai/blog/semantic-search.html"><![CDATA[<p>In my previous post, <a href="https://www.ombulabs.ai/blog/ai-assisted-marketing.html">An LLM-based AI Assistant for the FastRuby.io Newsletter</a>,
I introduced an AI-powered assistant we built with Sinatra to help our marketing team write summaries of blog posts for our newsletter.</p>

<p>In this post, I’ll go over how we implemented semantic search using <code class="language-plaintext highlighter-rouge">pgvector</code> and <code class="language-plaintext highlighter-rouge">Sequel</code> to fetch examples of previous summaries based on article content.</p>

<p>Semantic search allows our AI assistant to find the most relevant past examples, based on meaning and context, when generating new summaries.
This helps ensure consistency in tone and style while providing context-aware results that serve as better examples for the
large language model (LLM), improving the quality of the generated output.</p>

<!--more-->

<h2 id="brief-introduction-to-semantic-search-and-cosine-distance">Brief Introduction to Semantic Search and Cosine Distance</h2>

<p>Semantic search is a technique used to find items in a database that are similar, contextually or conceptually, to a given query.
This means we don’t need to rely solely on exact keyword matches, and instead can find items that are related in meaning.</p>

<p>It “understands” meaning and context by converting text into high-dimensional vectors called embeddings. These embeddings
capture semantic relationships, and allow us to find conceptually related items by calculating distances between vectors.</p>

<p>Cosine distance is one of the most popular metrics for measuring the similarity between two vectors. It measures the cosine of the
angle between two vectors, capturing how similar their semantic directions are, regardless of their magnitudes.</p>
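
<p>For intuition, here is a minimal Ruby sketch, separate from our application code, of how cosine distance is computed between two embedding vectors:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Cosine distance = 1 - cosine similarity of the two vectors.
def cosine_distance(a, b)
  dot    = a.zip(b).sum { |x, y| x * y }
  norm_a = Math.sqrt(a.sum { |x| x * x })
  norm_b = Math.sqrt(b.sum { |x| x * x })

  1 - dot / (norm_a * norm_b)
end

cosine_distance([1.0, 0.0], [0.7071, 0.7071]) # =&gt; ~0.293 (vectors 45 degrees apart)
cosine_distance([1.0, 0.0], [2.0, 0.0])       # =&gt; 0.0 (same direction, magnitude ignored)
</code></pre></div></div>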

<p>Other metrics supported by <code class="language-plaintext highlighter-rouge">pgvector</code> include Euclidean distance, inner product, taxicab (or Manhattan distance), Hamming distance, and Jaccard distance.
So why not use one of those instead?</p>

<ul>
  <li><strong>Euclidean distance</strong> is sensitive to magnitude and can suffer from the curse of dimensionality, making it less effective for high-dimensional data like text embeddings.</li>
  <li><strong>Inner product</strong> is better suited for recommendation systems; in our case, it could give too much weight to frequently covered topics and would not yield the best examples for our summaries.</li>
  <li><strong>Taxicab (Manhattan) distance</strong> is similar to Euclidean but uses absolute differences, which can be just as ineffective for high-dimensional data.</li>
  <li><strong>Hamming distance</strong> is designed for binary vectors, which is not our case: our embeddings are continuous, floating-point values.</li>
  <li><strong>Jaccard distance</strong> is also designed for binary or categorical data, not continuous embeddings.</li>
</ul>

<p>Therefore, cosine distance is the most appropriate choice for our use case, as it effectively captures the semantic similarity between text embeddings.</p>

<h2 id="getting-started-with-pgvector-and-sequel">Getting Started with pgvector and Sequel</h2>

<p>To implement semantic search, we used <code class="language-plaintext highlighter-rouge">pgvector</code> to store and query vector embeddings in our PostgreSQL database, and <code class="language-plaintext highlighter-rouge">Sequel</code> as our ORM to interact with the database.</p>

<p>For <code class="language-plaintext highlighter-rouge">pgvector</code> to work, you need to have the <code class="language-plaintext highlighter-rouge">pgvector</code> extension installed in your system and enabled in your PostgreSQL database.</p>

<p>You can install the <code class="language-plaintext highlighter-rouge">pgvector</code> extension by following the instructions in the <a href="https://github.com/pgvector/pgvector">pgvector documentation</a>,
then make sure it is enabled in your database.</p>

<div class="language-sql highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">CREATE</span> <span class="n">EXTENSION</span> <span class="n">IF</span> <span class="k">NOT</span> <span class="k">EXISTS</span> <span class="n">vector</span><span class="p">;</span>
</code></pre></div></div>

<p>Now you can add it to your Sequel configuration as an extension:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">require</span> <span class="s2">"sequel"</span>

<span class="no">Sequel</span><span class="p">.</span><span class="nf">extension</span> <span class="ss">:pgvector</span>  <span class="c1"># Extends the main Sequel module with pgvector functionality.</span>
<span class="no">DB</span> <span class="o">=</span> <span class="no">Sequel</span><span class="p">.</span><span class="nf">connect</span><span class="p">(</span><span class="no">ENV</span><span class="p">[</span><span class="s2">"DATABASE_URL"</span><span class="p">])</span>
<span class="no">DB</span><span class="p">.</span><span class="nf">extension</span> <span class="ss">:pgvector</span>  <span class="c1"># Extends the specific database connection with pgvector support.</span>
</code></pre></div></div>

<p>With the setup complete, we can now create a table to store our articles with their embeddings.</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="no">Sequel</span><span class="p">.</span><span class="nf">migration</span> <span class="k">do</span>
  <span class="n">change</span> <span class="k">do</span>
    <span class="n">create_table</span><span class="p">(</span><span class="ss">:articles</span><span class="p">)</span> <span class="k">do</span>
      <span class="n">primary_key</span> <span class="ss">:id</span>
      <span class="no">String</span> <span class="ss">:title</span><span class="p">,</span> <span class="ss">null: </span><span class="kp">false</span>
      <span class="no">String</span> <span class="ss">:content</span><span class="p">,</span> <span class="ss">text: </span><span class="kp">true</span><span class="p">,</span> <span class="ss">null: </span><span class="kp">false</span>
      <span class="no">String</span> <span class="ss">:summary</span><span class="p">,</span> <span class="ss">text: </span><span class="kp">true</span>
      <span class="n">column</span> <span class="ss">:embedding</span><span class="p">,</span> <span class="s2">"vector(1536)"</span>
      <span class="no">String</span> <span class="ss">:embedding_model</span>
      <span class="no">DateTime</span> <span class="ss">:embedding_created_at</span>

      <span class="n">foreign_key</span> <span class="ss">:link_id</span><span class="p">,</span> <span class="ss">:links</span><span class="p">,</span> <span class="ss">null: </span><span class="kp">false</span>
    <span class="k">end</span>
    <span class="n">add_index</span> <span class="ss">:articles</span><span class="p">,</span> <span class="ss">:embedding</span><span class="p">,</span> <span class="ss">type: :ivfflat</span><span class="p">,</span> <span class="ss">opclass: :vector_cosine_ops</span>
  <span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>

<p>This migration creates our <code class="language-plaintext highlighter-rouge">articles</code> table with a column to store the vector embeddings with a dimension of 1536.
The dimension of the vector is determined by the embedding model we use, in this case, OpenAI’s <code class="language-plaintext highlighter-rouge">ada-002</code> model, which produces 1536-dimensional embeddings.</p>

<p>The <code class="language-plaintext highlighter-rouge">embedding_model</code> and <code class="language-plaintext highlighter-rouge">embedding_created_at</code> columns are used to store metadata about the embedding.</p>

<p>For the index, <code class="language-plaintext highlighter-rouge">pgvector</code> supports both <code class="language-plaintext highlighter-rouge">ivfflat</code> and <code class="language-plaintext highlighter-rouge">hnsw</code> indexing methods. We chose <code class="language-plaintext highlighter-rouge">ivfflat</code> with the <code class="language-plaintext highlighter-rouge">vector_cosine_ops</code> operator class for cosine distance.
The <code class="language-plaintext highlighter-rouge">ivfflat</code> index requires less resources and has lower overhead. Our dataset is small (hundreds to lower thousands) and accuracy is good but not critical, so this is a good fit.</p>

<p>For interacting with the database and the embeddings, we created a model class using Sequel:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">require</span> <span class="s2">"pgvector"</span>

<span class="k">class</span> <span class="nc">Article</span> <span class="o">&lt;</span> <span class="no">Sequel</span><span class="o">::</span><span class="no">Model</span>
  <span class="n">plugin</span> <span class="ss">:pgvector</span><span class="p">,</span> <span class="ss">:embedding</span>

  <span class="k">def</span> <span class="nf">embedding</span>
    <span class="n">raw_value</span> <span class="o">=</span> <span class="nb">self</span><span class="p">[</span><span class="ss">:embedding</span><span class="p">]</span>
    <span class="k">return</span> <span class="kp">nil</span> <span class="k">unless</span> <span class="n">raw_value</span>

    <span class="n">raw_value</span>
  <span class="k">rescue</span> <span class="no">StandardError</span> <span class="o">=&gt;</span> <span class="n">e</span>
    <span class="nb">puts</span> <span class="s2">"Error retrieving embedding for article </span><span class="si">#{</span><span class="nb">id</span><span class="si">}</span><span class="s2">: </span><span class="si">#{</span><span class="n">e</span><span class="p">.</span><span class="nf">message</span><span class="si">}</span><span class="s2">"</span>
    <span class="kp">nil</span>
  <span class="k">end</span>

  <span class="k">def</span> <span class="nf">embedding?</span>
    <span class="o">!</span><span class="n">embedding</span><span class="p">.</span><span class="nf">nil?</span>
  <span class="k">rescue</span> <span class="no">StandardError</span>
    <span class="kp">false</span>
  <span class="k">end</span>
<span class="k">end</span>
</code></pre></div></div>

<p>Due to <code class="language-plaintext highlighter-rouge">pgvector</code>’s type casting, if the value of the <code class="language-plaintext highlighter-rouge">embedding</code> column is <code class="language-plaintext highlighter-rouge">nil</code>, it throws an error when trying to access it.
The <code class="language-plaintext highlighter-rouge">embedding</code> method we defined ensures that, in those cases, we return <code class="language-plaintext highlighter-rouge">nil</code> instead of raising an error, allowing us to handle missing embeddings nicely.</p>

<h2 id="embedding-and-storing-content">Embedding and Storing Content</h2>

<p>As previously mentioned, we chose to use the OpenAI <code class="language-plaintext highlighter-rouge">ada-002</code> model to generate embeddings. We are using the <code class="language-plaintext highlighter-rouge">langchain.rb</code> library
to handle interactions with the OpenAI API, including generating embeddings.</p>

<p>With an OpenAI API key, the client can be initialized with the embedding model value set:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="vi">@client</span> <span class="o">=</span> <span class="no">Langchain</span><span class="o">::</span><span class="no">LLM</span><span class="o">::</span><span class="no">OpenAI</span><span class="p">.</span><span class="nf">new</span><span class="p">(</span>
  <span class="ss">api_key: </span><span class="no">ENV</span><span class="p">.</span><span class="nf">fetch</span><span class="p">(</span><span class="s2">"OPENAI_API_KEY"</span><span class="p">),</span>
  <span class="ss">default_options: </span><span class="p">{</span>
    <span class="ss">temperature: </span><span class="mf">0.7</span><span class="p">,</span>
    <span class="ss">chat_model: </span><span class="s2">"gpt-40"</span><span class="p">,</span>
    <span class="ss">embedding_model: </span><span class="s2">"ada-002"</span>
  <span class="p">}</span>
<span class="p">)</span>
</code></pre></div></div>

<p>Embedding the document becomes a simple call to the <code class="language-plaintext highlighter-rouge">.embed</code> method of the client:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">embed</span><span class="p">(</span><span class="n">doc</span><span class="p">)</span>
  <span class="n">response</span> <span class="o">=</span> <span class="vi">@client</span><span class="p">.</span><span class="nf">embed</span><span class="p">(</span><span class="ss">text: </span><span class="n">doc</span><span class="p">)</span>
  <span class="n">response</span><span class="p">.</span><span class="nf">embeddings</span><span class="p">.</span><span class="nf">first</span>
<span class="k">end</span>
</code></pre></div></div>

<p>The embedding can then be stored in the database alongside the article’s other attributes:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">embedding</span> <span class="o">=</span> <span class="n">embed</span><span class="p">(</span><span class="n">content</span><span class="p">,</span> <span class="n">link_id</span><span class="p">)</span>

<span class="no">Article</span><span class="p">.</span><span class="nf">create</span><span class="p">(</span>
  <span class="ss">link_id: </span><span class="n">link_id</span><span class="p">,</span>
  <span class="ss">title: </span><span class="n">title</span><span class="p">,</span>
  <span class="ss">content: </span><span class="n">content</span><span class="p">,</span>
  <span class="ss">embedding: </span><span class="n">embedding</span><span class="p">,</span>
  <span class="ss">embedding_model: </span><span class="n">ada</span><span class="o">-</span><span class="mo">002</span><span class="p">,</span>
  <span class="ss">embedding_created_at: </span><span class="no">Time</span><span class="p">.</span><span class="nf">now</span>
<span class="p">)</span>
</code></pre></div></div>

<h2 id="performing-semantic-search">Performing Semantic Search</h2>

<p>We can now perform semantic search to find articles that are closely related to the one in hand. In my previous article,
I showed a simple nearest neighbors search using cosine distance:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">fetch_examples</span><span class="p">(</span><span class="n">article</span><span class="p">)</span>
  <span class="n">examples</span> <span class="o">=</span> <span class="n">article</span><span class="p">.</span><span class="nf">nearest_neighbors</span><span class="p">(</span><span class="ss">:embedding</span><span class="p">,</span> <span class="ss">distance: </span><span class="s2">"cosine"</span><span class="p">).</span><span class="nf">limit</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>
  <span class="n">examples</span><span class="p">.</span><span class="nf">map</span><span class="p">(</span><span class="ss">:summary</span><span class="p">)</span>
<span class="k">end</span>
</code></pre></div></div>

<p>Here, <code class="language-plaintext highlighter-rouge">article</code> is an instance of the <code class="language-plaintext highlighter-rouge">Article</code> model, and we’re retrieving the three most similar articles based on their embeddings using the <code class="language-plaintext highlighter-rouge">nearest_neighbors</code> method.</p>

<p>This method works, but it is limited. Here, we’ll get the three most similar articles, regardless of <em>how</em> similar they are.
Similarity is calculated using a distance metric, as explained above, but we are not taking that distance score into account.</p>

<p>To improve this, we can add a threshold to filter out articles that are not similar enough. The <code class="language-plaintext highlighter-rouge">pgvector</code> extension adds
a specific operator for each distance metric, allowing us to filter results based on a minimum similarity score. For cosine distance, we can use the <code class="language-plaintext highlighter-rouge">&lt;=&gt;</code> operator, which returns the cosine distance between two vectors:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">fetch_examples</span><span class="p">(</span><span class="n">article</span><span class="p">)</span>
  <span class="n">examples</span> <span class="o">=</span> <span class="no">Article</span><span class="p">.</span><span class="nf">select</span><span class="p">(</span><span class="s2">"id"</span><span class="p">,</span> <span class="s2">"title"</span><span class="p">,</span> <span class="s2">"content"</span><span class="p">,</span>
    <span class="no">Sequel</span><span class="p">.</span><span class="nf">lit</span><span class="p">(</span><span class="s2">"1 - (embedding &lt;=&gt; '</span><span class="si">#{</span><span class="n">article</span><span class="p">.</span><span class="nf">embedding</span><span class="si">}</span><span class="s2">'::vector) AS similarity_score"</span><span class="p">))</span>
    <span class="p">.</span><span class="nf">where</span><span class="p">(</span><span class="no">Sequel</span><span class="p">.</span><span class="nf">lit</span><span class="p">(</span><span class="s2">"1 - (embedding &lt;=&gt; '</span><span class="si">#{</span><span class="n">article</span><span class="p">.</span><span class="nf">embedding</span><span class="si">}</span><span class="s2">'::vector) &gt;= ?"</span><span class="p">,</span> <span class="mf">0.75</span><span class="p">))</span>
    <span class="p">.</span><span class="nf">order</span><span class="p">(</span><span class="no">Sequel</span><span class="p">.</span><span class="nf">lit</span><span class="p">(</span><span class="s2">"embedding &lt;=&gt; '</span><span class="si">#{</span><span class="n">article</span><span class="p">.</span><span class="nf">embedding</span><span class="si">}</span><span class="s2">'::vector"</span><span class="p">))</span>
    <span class="p">.</span><span class="nf">limit</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>
  <span class="n">examples</span><span class="p">.</span><span class="nf">map</span><span class="p">(</span><span class="ss">:summary</span><span class="p">)</span>
<span class="k">end</span>
</code></pre></div></div>

<p>This query retrieves the three most similar articles to the given one, but only if their similarity score is above 0.75.</p>

<p>Now, we can reliably retrieve good examples of previous summaries that are contextually relevant to the article being summarized,
guaranteeing that poor matches are filtered out. This allows our AI assistant to provide better output when generating new summaries,
as the LLM has better examples to work with.</p>

<h2 id="conclusion">Conclusion</h2>

<p>Semantic search is a powerful technique that allows us to find contextually relevant items in our database, improving the quality of AI-generated content.</p>

<p>By using <code class="language-plaintext highlighter-rouge">pgvector</code> and <code class="language-plaintext highlighter-rouge">Sequel</code>, we can easily store and query vector embeddings, enabling us to perform similarity searches based on semantic meaning rather than just keywords.
These tools are open source and easy to use, making them a great choice for implementing semantic search in Ruby applications.</p>

<p>Want to know how we can help you leverage AI for your business? <a href="/#contact-us">Talk to us today!</a>.</p>]]></content><author><name>abizzinotto</name></author><category term="artificial-intelligence" /><summary type="html"><![CDATA[In my previous post, An LLM-based AI Assistant for the FastRuby.io Newsletter, I introduced an AI-powered assistant we built with Sinatra to help our marketing team write summaries of blog posts for our newsletter. In this post, I’ll go over how we implemented semantic search using pgvector and Sequel to fetch examples of previous summaries based on article content. Semantic search allows our AI assistant to find the most relevant past examples, given meaning and context, when generating new summaries. This helps ensure consistency in tone and style while providing context-aware results that will serve as better examples for the large language modal (LLM) to generate new summaries, improving the quality of the generated output.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.ombulabs.ai/blog/semantic-search-sequel-pgvector.jpg" /><media:content medium="image" url="https://www.ombulabs.ai/blog/semantic-search-sequel-pgvector.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">An LLM-based AI Assistant for the FastRuby.io Newsletter</title><link href="https://www.ombulabs.ai/blog/ai-assisted-marketing.html" rel="alternate" type="text/html" title="An LLM-based AI Assistant for the FastRuby.io Newsletter" /><published>2025-06-02T21:22:15-04:00</published><updated>2025-06-02T21:22:15-04:00</updated><id>https://www.ombulabs.ai/blog/ai-assisted-marketing</id><content type="html" xml:base="https://www.ombulabs.ai/blog/ai-assisted-marketing.html"><![CDATA[<p>Every other week, the <a href="https://www.fastruby.io/newsletter">FastRuby.io newsletter</a> brings a curated list of the best Ruby and Rails articles, tutorials, and news to your inbox.</p>

<p>Our engineering team collects links to interesting articles and our marketing team curates them, writes a summary for each article, and creates the newsletter.
This process is quite manual, and involves some back and forth to ensure summaries are accurate, engaging, and relevant to our audience.</p>

<p>To make it more efficient, we have developed an AI assistant that helps us curate articles and generate the summaries for the newsletter.</p>

<!--more-->

<h2 id="why-an-ai-assistant">Why an AI Assistant?</h2>

<p>We wanted a tool that could reduce the repetitive parts of the workflow without taking away the human touch that is essential for effective communication.
Summarizing a dozen articles every other week can be tedious and time-consuming, but it is necessary. We still want summaries that sound like us and highlight the right things.
Hence the AI assistant.</p>

<p>The AI assistant leverages a large language model (LLM) to analyze the content of the articles, extract key points, and generate concise summaries.
This helps our marketing team save some time and focus on the areas of the newsletter that require human creativity and judgment.</p>

<h2 id="the-stack">The Stack</h2>

<p>We wanted something that was easy to build, quick to set up, and simple enough for our marketing team to use on their own. This is an internal tool, so we prioritized something quick that works over polish.</p>

<p>We chose to build the AI assistant using:</p>

<ul>
  <li><strong>Sinatra</strong>: to create a simple interface for our marketing team to interact with the AI assistant.</li>
  <li><strong>pgvector</strong>: to store and query vector embeddings of the article summaries.</li>
  <li><strong>Langchain.rb</strong>: to handle the interaction with the embedding model, the LLM, and to manage the workflow.</li>
</ul>

<p>For the embeddings, we used OpenAI’s <code class="language-plaintext highlighter-rouge">ada-002</code> model, which is well-suited for generating high-quality embeddings for text. For the LLM, we used OpenAI’s <code class="language-plaintext highlighter-rouge">gpt-4o</code> model.</p>
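
<p>For reference, a minimal sketch of that client setup with Langchain.rb, assuming OpenAI’s current model identifiers, looks like this:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>require "langchain"

# Illustrative setup: one client handles both chat completions and embeddings.
client = Langchain::LLM::OpenAI.new(
  api_key: ENV.fetch("OPENAI_API_KEY"),
  default_options: {
    temperature: 0.7,
    chat_model: "gpt-4o",
    embedding_model: "text-embedding-ada-002"
  }
)
</code></pre></div></div>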

<h2 id="how-it-works">How It Works</h2>

<p>To make it easy for our team to suggest links, we created a simple Slack integration that works through a Slack command. When a team member suggests a link, the AI assistant runs through the following steps (a rough code sketch of this flow appears after the list):</p>

<ul>
  <li>Fetches the article’s HTML content.</li>
  <li>Extracts the title and main content using <code class="language-plaintext highlighter-rouge">nokogiri</code> (a Ruby HTML parser).</li>
  <li>Does some minimal cleaning of the content to remove unnecessary elements.</li>
  <li>Embeds the content using the <code class="language-plaintext highlighter-rouge">ada-002</code> model to create a vector representation.</li>
  <li>Stores the title, content, and vector in a PostgreSQL database using <code class="language-plaintext highlighter-rouge">pgvector</code>.</li>
  <li>Triggers the summary generation process.</li>
</ul>
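
<p>Here is a rough Ruby sketch of that ingestion flow. The helper names, the content selectors, and the <code class="language-plaintext highlighter-rouge">link_id</code> parameter are illustrative rather than our production code; <code class="language-plaintext highlighter-rouge">embed</code> wraps the embedding call and <code class="language-plaintext highlighter-rouge">Article</code> is the Sequel model backed by a <code class="language-plaintext highlighter-rouge">pgvector</code> column.</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code>require "net/http"
require "nokogiri"

# Illustrative ingestion flow for a suggested link (names and selectors are hypothetical).
def ingest(url, link_id)
  html = Net::HTTP.get(URI(url))                       # Fetch the article's HTML content
  doc  = Nokogiri::HTML(html)

  title   = doc.at("title")&amp;.text.to_s.strip           # Extract the title
  content = doc.css("p").map(&amp;:text).join("\n").strip  # Minimal cleanup: keep paragraph text only

  Article.create(
    link_id: link_id,
    title: title,
    content: content,
    embedding: embed(content),                         # Vector representation from ada-002
    embedding_model: "text-embedding-ada-002",
    embedding_created_at: Time.now
  )
end
</code></pre></div></div>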

<p>We’ll walk through the summary generation process in detail.</p>

<h2 id="summary-generation">Summary Generation</h2>

<p>Immediately after the article is added, the AI assistant generates a summary using the <code class="language-plaintext highlighter-rouge">gpt-4o</code> model. First, it retrieves three examples from our database of previously generated summaries using similarity search with <code class="language-plaintext highlighter-rouge">pgvector</code>.
Performing a cosine similarity search on our <code class="language-plaintext highlighter-rouge">articles</code> table with <code class="language-plaintext highlighter-rouge">pgvector</code> is quite easy:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">fetch_examples</span><span class="p">(</span><span class="n">article</span><span class="p">)</span>
  <span class="n">examples</span> <span class="o">=</span> <span class="n">article</span><span class="p">.</span><span class="nf">nearest_neighbors</span><span class="p">(</span><span class="ss">:embedding</span><span class="p">,</span> <span class="ss">distance: </span><span class="s2">"cosine"</span><span class="p">).</span><span class="nf">limit</span><span class="p">(</span><span class="mi">3</span><span class="p">)</span>
  <span class="n">examples</span><span class="p">.</span><span class="nf">map</span><span class="p">(</span><span class="ss">:summary</span><span class="p">)</span>
<span class="k">end</span>
</code></pre></div></div>

<p>Here, <code class="language-plaintext highlighter-rouge">article</code> is an instance of the <code class="language-plaintext highlighter-rouge">Article</code> model, which has a <code class="language-plaintext highlighter-rouge">pgvector</code> column called <code class="language-plaintext highlighter-rouge">embedding</code>. The <code class="language-plaintext highlighter-rouge">nearest_neighbors</code> method retrieves the three most similar articles based on their embeddings.</p>

<p>Next, the AI assistant generates a summary using a generate and review strategy. It first generates a draft summary based on the article content and the examples retrieved.
Then, it reviews the draft against the examples and a set of instructions to ensure it aligns with our style and tone. If it does, it approves the draft. If it doesn’t, it provides feedback to be used to refine the draft.</p>

<p>The first draft is generated using a prompt with the following structure:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&lt;&lt;~PROMPT
  [Context: What kind of assistant is this?]

  [Context: What will the assistant be looking at?]

  [Task]

  1. Instruction number 1
  2. Instruction number 2
  3. Instruction number 3

  [Call to action: What should the assistant do?]

  **Examples of past summaries:**
  #{examples.map { |ex| "- #{ex.strip}" }.join("\n")}

  **Blog Post:**

  *Title:* #{title.strip}

  *Content:*
  #{content.strip}

  Return your response in this JSON format. Return ONLY the JSON object.
  {
    "title": "...",
    "summary": "..."
  }
PROMPT
</code></pre></div></div>

<p>After the draft is generated, the AI assistant reviews it using a prompt structured like this:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&lt;&lt;~PROMPT
  You are a critical editor, review the snippet below:

  **Title:**
  #{title}

  **Article Content:**
  #{content}

  **Summary:**
  #{summary}

  Compare this snippet to the tone, length and style of these examples:
  #{examples.map { |ex| "- #{ex.strip}" }.join("\n")}

  Is it:
  - Characteristic number 1
  - Characteristic number 2

  Does it:
  - Question number 1
  - Question number 2

  If the snippet is accurate and acceptable, respond ONLY with:
  {"approved": true}

  If it needs edits, respond ONLY with:
  {"approved": false, "feedback": "...", "revised_summary": "..."}
PROMPT
</code></pre></div></div>

<p>The generate function looks like this:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">generate_summary</span><span class="p">(</span><span class="n">url</span><span class="p">,</span> <span class="n">parsed_blog</span><span class="p">,</span> <span class="ss">max_attempts: </span><span class="mi">3</span><span class="p">)</span>
    <span class="k">raise</span> <span class="s2">"URL is required"</span> <span class="k">if</span> <span class="n">url</span><span class="p">.</span><span class="nf">nil?</span> <span class="o">||</span> <span class="n">url</span><span class="p">.</span><span class="nf">empty?</span>

    <span class="n">examples</span> <span class="o">=</span> <span class="n">fetch_examples</span><span class="p">(</span><span class="n">parsed_blog</span><span class="p">[</span><span class="ss">:content</span><span class="p">])</span>

    <span class="c1"># Generate the initial summary</span>
    <span class="n">summary</span> <span class="o">=</span> <span class="n">generate</span><span class="p">(</span><span class="n">parsed_blog</span><span class="p">[</span><span class="ss">:title</span><span class="p">],</span> <span class="n">parsed_blog</span><span class="p">[</span><span class="ss">:content</span><span class="p">],</span> <span class="n">examples</span><span class="p">)</span>

    <span class="c1"># Review the generated summary</span>
    <span class="n">revised_summary</span> <span class="o">=</span> <span class="n">review</span><span class="p">(</span><span class="n">parsed_blog</span><span class="p">[</span><span class="ss">:title</span><span class="p">],</span> <span class="n">parsed_blog</span><span class="p">[</span><span class="ss">:content</span><span class="p">],</span> <span class="n">summary</span><span class="p">,</span> <span class="n">examples</span><span class="p">,</span> <span class="n">max_attempts</span><span class="p">)</span>
    <span class="n">revised_summary</span><span class="p">[</span><span class="ss">:summary</span><span class="p">]</span>
  <span class="k">end</span>
</code></pre></div></div>

<p>Where the <code class="language-plaintext highlighter-rouge">generate</code> and <code class="language-plaintext highlighter-rouge">review</code> methods handle the interaction with the LLM using Langchain.rb.</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">generate</span><span class="p">(</span><span class="n">title</span><span class="p">,</span> <span class="n">content</span><span class="p">,</span> <span class="n">examples</span><span class="p">)</span>
    <span class="n">prompt</span> <span class="o">=</span> <span class="n">prompts</span><span class="p">.</span><span class="nf">generate_snippet</span><span class="p">(</span><span class="n">title</span><span class="p">,</span> <span class="n">content</span><span class="p">,</span> <span class="n">examples</span><span class="p">)</span>
    <span class="n">summary</span> <span class="o">=</span> <span class="n">client</span><span class="p">.</span><span class="nf">chat</span><span class="p">(</span><span class="n">prompt</span><span class="p">,</span> <span class="ss">system_prompt: </span><span class="n">prompts</span><span class="p">.</span><span class="nf">system</span><span class="p">)</span>
    <span class="k">raise</span> <span class="s2">"Incomplete snippet: </span><span class="si">#{</span><span class="n">summary</span><span class="si">}</span><span class="s2">"</span> <span class="k">unless</span> <span class="n">summary</span><span class="p">[</span><span class="ss">:title</span><span class="p">]</span> <span class="o">&amp;&amp;</span> <span class="n">summary</span><span class="p">[</span><span class="ss">:summary</span><span class="p">]</span>

    <span class="n">summary</span>
<span class="k">end</span>

<span class="k">def</span> <span class="nf">review</span><span class="p">(</span><span class="n">title</span><span class="p">,</span> <span class="n">content</span><span class="p">,</span> <span class="n">summary</span><span class="p">,</span> <span class="n">examples</span><span class="p">,</span> <span class="n">max_attempts</span><span class="p">)</span>
  <span class="n">attempt</span> <span class="o">=</span> <span class="mi">1</span>
  <span class="k">while</span> <span class="n">attempt</span> <span class="o">&lt;</span> <span class="n">max_attempts</span>
    <span class="n">prompt</span> <span class="o">=</span> <span class="n">prompts</span><span class="p">.</span><span class="nf">critic</span><span class="p">(</span><span class="n">title</span><span class="p">,</span> <span class="n">content</span><span class="p">,</span> <span class="n">summary</span><span class="p">[</span><span class="ss">:snippet</span><span class="p">],</span> <span class="n">examples</span><span class="p">)</span>
    <span class="n">review</span> <span class="o">=</span> <span class="n">client</span><span class="p">.</span><span class="nf">chat</span><span class="p">(</span><span class="n">prompt</span><span class="p">,</span> <span class="ss">system_prompt: </span><span class="n">prompts</span><span class="p">.</span><span class="nf">system</span><span class="p">)</span>

    <span class="k">return</span> <span class="n">summary</span> <span class="k">if</span> <span class="n">review</span><span class="p">[</span><span class="ss">:approved</span><span class="p">]</span>

    <span class="k">raise</span> <span class="s2">"Critic failed to provide a revised snippet"</span> <span class="k">unless</span> <span class="n">review</span><span class="p">[</span><span class="ss">:revised_summary</span><span class="p">]</span>

    <span class="n">summary</span><span class="p">[</span><span class="ss">:summary</span><span class="p">]</span> <span class="o">=</span> <span class="n">review</span><span class="p">[</span><span class="ss">:revised_summary</span><span class="p">]</span>
    <span class="n">attempt</span> <span class="o">+=</span> <span class="mi">1</span>
    <span class="k">break</span> <span class="k">if</span> <span class="n">attempt</span> <span class="o">&gt;</span> <span class="n">max_attempts</span>

  <span class="k">end</span>
  <span class="n">summary</span>
<span class="k">end</span>
</code></pre></div></div>

<p>This process allows the AI assistant to generate summaries that are not only accurate but also aligned with our brand’s voice and style.</p>

<h2 id="summary-re-generation">Summary Re-Generation</h2>

<p>If the AI assistant generates a summary that is not quite suited for the newsletter, it can be easily re-generated by the marketing team.</p>

<p>The team can simply click a button in the interface and add their feedback for the model to consider, and that will trigger the summary regeneration process. Optionally, they can also change the temperature of the LLM to make the output more or less creative.</p>

<p>Regenerating a summary is similar to the initial generation, but it includes the feedback provided by the marketing team:</p>

<div class="language-text highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&lt;&lt;~PROMPT
  [Context: What kind of assistant is this?]

  You are correcting a snippet that has been suggested and rejected. When creating a snippet, you must always consider the following:

  1. Instruction number 1
  2. Instruction number 2

  [Task]

  [Additional instructions on how to handle the feedback provided.]

  **Feedback:**
  #{feedback.strip}

  **Blog Post:**

  *Title:* #{title.strip}

  *Content:*
  #{content.strip}

  **Previous Snippet:**
  #{snippet.strip}

  Return your response in this JSON format. Return ONLY the JSON object.
  {
    "title": "...",
    "snippet": "..."
  }
PROMPT
</code></pre></div></div>

<p>Regeneration does not include the review step, as the feedback is already provided by the marketing team:</p>

<div class="language-ruby highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">def</span> <span class="nf">regenerate</span><span class="p">(</span><span class="n">title</span><span class="p">,</span> <span class="n">content</span><span class="p">,</span> <span class="n">snippet</span><span class="p">,</span> <span class="n">feedback</span><span class="p">,</span> <span class="n">temperature</span><span class="p">)</span>
    <span class="n">prompt</span> <span class="o">=</span> <span class="n">prompts</span><span class="p">.</span><span class="nf">regenerate_snippet</span><span class="p">(</span><span class="n">title</span><span class="p">,</span> <span class="n">content</span><span class="p">,</span> <span class="n">snippet</span><span class="p">,</span> <span class="n">feedback</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">temperature</span>
      <span class="n">temperature</span> <span class="o">=</span> <span class="n">temperature</span><span class="p">.</span><span class="nf">to_f</span> <span class="o">/</span> <span class="mi">10</span>
      <span class="n">client</span><span class="p">.</span><span class="nf">temperature</span> <span class="o">=</span> <span class="n">temperature</span>
    <span class="k">end</span>
    <span class="n">revised_snippet</span> <span class="o">=</span> <span class="n">client</span><span class="p">.</span><span class="nf">chat</span><span class="p">(</span><span class="n">prompt</span><span class="p">,</span> <span class="ss">system_prompt: </span><span class="n">prompts</span><span class="p">.</span><span class="nf">system</span><span class="p">)</span>
    <span class="k">raise</span> <span class="s2">"Incomplete snippet: </span><span class="si">#{</span><span class="n">revised_snippet</span><span class="si">}</span><span class="s2">"</span> <span class="k">unless</span> <span class="n">revised_snippet</span><span class="p">[</span><span class="ss">:title</span><span class="p">]</span> <span class="o">&amp;&amp;</span> <span class="n">revised_snippet</span><span class="p">[</span><span class="ss">:snippet</span><span class="p">]</span>

    <span class="n">revised_snippet</span>
  <span class="k">end</span>
</code></pre></div></div>

<p>Our marketing team can then just copy the summary to use in the newsletter content, or tweak it further if needed.</p>

<h2 id="conclusion">Conclusion</h2>

<p>The AI assistant we built for the FastRuby.io newsletter has helped streamline our workflow, allowing our marketing team to focus on the creative aspects of curation while automating the repetitive tasks of gathering and summarizing links.</p>

<p>Through a mix of LLM-powered functionality, a simple interface, and a Slack integration, we have been able to create a tool that saves our marketing team a significant amount of operational time.</p>

<p>Want to know how we can help you leverage AI for your business? <a href="/#contact-us">Talk to us today!</a>.</p>]]></content><author><name>abizzinotto</name></author><category term="generative-ai" /><summary type="html"><![CDATA[Every other week, the FastRuby.io newsletter brings a curated list of the best Ruby and Rails articles, tutorials, and news to your inbox. Our engineering team collects links to interesting articles and our marketing team curates them, writes a summary for each article, and creates the newsletter. This process is quite manual, and involves some back and forth to ensure summaries are accurate, engaging, and relevant to our audience. To make if more efficient, we have developed an AI assistant that helps us curate articles and generate the summaries for the newsletter.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.ombulabs.ai/blog/ai-assistant-for-fr-newsletter.jpg" /><media:content medium="image" url="https://www.ombulabs.ai/blog/ai-assistant-for-fr-newsletter.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry><entry><title type="html">Parallax Proves High-Value Concept with OmbuLabs’ Predictive ML Model</title><link href="https://www.ombulabs.ai/blog/parallax-proves-predictive-model-concept.html" rel="alternate" type="text/html" title="Parallax Proves High-Value Concept with OmbuLabs’ Predictive ML Model" /><published>2025-05-29T08:00:00-04:00</published><updated>2025-05-29T08:00:00-04:00</updated><id>https://www.ombulabs.ai/blog/parallax-proves-predictive-model-concept</id><content type="html" xml:base="https://www.ombulabs.ai/blog/parallax-proves-predictive-model-concept.html"><![CDATA[<p>Parallax was beginning to explore the use of <a href="/blog/tags/artificial-intelligence">artificial intelligence</a> (AI) or machine learning (ML) to leverage the wealth of data on hand about customer projects, with the goal of improving their resource planning.
The company thought it might be possible to create a machine learning model that identifies customer projects at risk, equipping the Customer Success team to make data-driven recommendations on how to head off problems before they occur.</p>

<!--more-->

<h2 id="background">Background</h2>

<p>Founded in 2019, <a href="https://www.getparallax.com">Parallax</a> helps digital service organizations optimize operations using sophisticated tools that improve capacity planning and resource planning and management.
The Minnesota-based company equips small and mid-sized organizations to align people with work, enabling them to make hiring and staff utilization decisions that drive higher growth and profitability.</p>

<h2 id="context">Context</h2>

<p>Parallax was beginning to explore the use of artificial intelligence (AI) or <a href="/blog/tags/machine-learning">machine learning</a> (ML) to leverage the wealth of data on hand about customer projects, with the goal of improving their resource planning.
The company thought it might be possible to create a machine learning model that identifies customer projects at risk, equipping the Customer Success team to make data-driven recommendations on how to head off problems before they occur.</p>

<p>But as a SaaS startup, Parallax needs to focus internal capacity on developing solutions that fulfill contractual obligations and enhancing existing services.
The company doesn’t have the luxury of tying up staff to explore a proof of concept.</p>

<blockquote>
  <p>“By working with OmbuLabs, we could quickly experiment and iterate on a new possibility outside our core product, without cannibalizing our team’s time or impacting our product development velocity,” — Jacob Ward, Head of Product at Parallax.</p>
</blockquote>

<h2 id="exploration">Exploration</h2>

<p>The exploratory engagement began with a focus on two questions about the Parallax customer base:</p>

<ul>
  <li>Is our customer doing a good (or poor) job at resource planning?</li>
  <li>What are the most common problems that derail our customers’ resource plans?</li>
</ul>

<p>Our goal was to use ML-generated insights derived from these questions to develop a solution that delivered value to customers using its planning tools.</p>

<p>We conducted extensive data analysis and data validation centered around Parallax customers’ typical planning scenarios.
Through collaboration and discussions with our client, it became apparent that the original questions weren’t the most relevant or pressing to address.
So our focus shifted to helping Parallax identify the actual problem that needed to be solved, and whether machine learning was the right solution.</p>

<p>Together, we determined it would be more valuable to customers if Parallax used ML as a predictive tool to guide resource planning dynamically. The new questions became:</p>

<ul>
  <li>Can we build a model that predicts the number of hours each role will log on a given project each day?</li>
  <li>How would those predictions improve our customers’ resource planning?</li>
</ul>

<h2 id="our-approach-for-a-successful-engagement">Our Approach for a Successful Engagement</h2>

<p>We analyzed a large volume of historical project data across the Parallax customer base to determine the best approach to addressing these questions.</p>

<blockquote>
  <p>“They worked collaboratively with our engineers to get the data into a format they could model and execute against,” Ward explained.
“It felt like they were an extension of our team, helping us explore the concept.”</p>
</blockquote>

<p>The team kept iterating to understand what they could do with the data and how to move forward with the idea in a way that solved a relevant customer challenge.</p>

<p>Through a process of exploring, prototyping, and validating, we built a custom regression model from the ground up.
Our team trained the model on a large number of completed projects across the Parallax ecosystem and fine-tuned it to take multiple factors into consideration.
Then we developed a Python application that allows Parallax’s C# platform to interact with the machine learning model from within its existing infrastructure.
The application also applies statistical modeling to the model’s predictions to adjust calculations and improve the confidence intervals.</p>
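<p>For illustration, here is a minimal sketch of what training such a regression model on historical time-entry data could look like with scikit-learn (one of the frameworks listed at the end of this post). The case study does not disclose the actual algorithm or features, so the feature names, synthetic data, and model choice below are assumptions rather than the delivered implementation.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical sketch: train a regression model that estimates hours
# logged per role per day. Features, data, and algorithm are illustrative.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_absolute_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 5_000

# Assumed features: planned hours for the day, role seniority,
# total planned project hours, and fraction of the project elapsed.
X = np.column_stack([
    rng.uniform(0, 8, n),
    rng.integers(1, 5, n),
    rng.uniform(50, 2000, n),
    rng.uniform(0, 1, n),
])
# Synthetic target (hours actually logged), for demonstration only.
y = np.clip(0.9 * X[:, 0] + 0.2 * X[:, 3] * X[:, 0] + rng.normal(0, 0.5, n), 0, None)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingRegressor(random_state=0)
model.fit(X_train, y_train)
print("MAE (hours):", mean_absolute_error(y_test, model.predict(X_test)))
</code></pre></div></div>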

<h2 id="the-outcome-a-predictive-model">The Outcome: A Predictive Model</h2>

<p>At the conclusion of the engagement, Parallax gained a custom predictive model and a working API hosted in Azure.
The solution extracts relevant historical data across all Parallax customer projects, roles, and employees, compares it to the conditions of a current customer project, and predicts what will happen next.
The model returns a forecast with confidence intervals bound by upper and lower thresholds.</p>
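<p>The API contract itself is not published, so the FastAPI sketch below is only an assumed shape for a forecast endpoint that returns a prediction bounded by upper and lower thresholds; the route, request fields, and interval arithmetic are placeholders.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Hypothetical endpoint shape; not the actual Parallax API.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class ForecastRequest(BaseModel):
    project_id: str
    role: str
    planned_hours_per_day: float

class ForecastResponse(BaseModel):
    predicted_hours_per_day: float
    lower_bound: float
    upper_bound: float

@app.post("/forecast", response_model=ForecastResponse)
def forecast(req: ForecastRequest) -> ForecastResponse:
    # Placeholder logic; in practice this would call the trained model
    # and the statistical layer that adjusts the confidence intervals.
    predicted = 0.85 * req.planned_hours_per_day
    margin = 0.15 * req.planned_hours_per_day
    return ForecastResponse(
        predicted_hours_per_day=predicted,
        lower_bound=max(predicted - margin, 0.0),
        upper_bound=predicted + margin,
    )
</code></pre></div></div>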

<p>The model can predict how many hours a particular role will log on a project daily based on planned hours and historical trends, taking into account any deviations specific to the customer’s organization.
As the customer adds new roles or employees to the project, the model responds dynamically, accounting for their workload and availability across all their assignments.</p>

<p>For example, the customer’s resource plan might assume that Person A will log 25 hours on the project this week and Person B will log 18 hours.
But the model might predict very different activity levels, which could significantly impact staff utilization and profitability.
Equipped with this information, a digital service company can take proactive steps early enough to course-correct and prevent costly problems.</p>
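<p>As a rough sketch of how that comparison might surface risk, the snippet below contrasts planned hours with hypothetical model predictions and flags large deviations; the numbers echo the example above, and the 20% threshold is an assumption, not part of the delivered model.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Illustrative only: flag people whose predicted hours deviate
# from the plan by more than an assumed 20% threshold.
planned = {"Person A": 25.0, "Person B": 18.0}
predicted = {"Person A": 31.5, "Person B": 12.0}  # hypothetical output

for person, plan_hours in planned.items():
    delta = predicted[person] - plan_hours
    if abs(delta) / plan_hours > 0.2:
        print(f"{person}: planned {plan_hours}h, predicted {predicted[person]}h "
              f"({delta:+.1f}h) -- review staffing")
</code></pre></div></div>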

<h2 id="next-steps">Next Steps</h2>

<p>Parallax was extremely pleased with both the process and the end result.</p>

<blockquote>
  <p>“This collaboration represents a successful pivot from exploratory research into a deployable, value-driving tool,” Ward noted.
“It laid the foundation for deeper strategic applications within our platform, which could help teams plan more effectively and make informed, autonomous decisions through agents and predictive intelligence.”</p>
</blockquote>

<p>Ward described the process as a valuable learning experience.</p>

<blockquote>
  <p>“We invested in working with a partner that knew how to guide us down a path of understanding what we could do with our data and how machine learning and AI could make an impact for our customers.”</p>
</blockquote>

<p>Parallax especially valued our domain expertise and ability to evaluate how to use data to meet a customer need.</p>

<blockquote>
  <p>“They consulted us on what our data could do and guided us to a solution that was viable for end users,” Ward said.
“It proved our hypothesis: that what would be valuable for customers is actually possible to deliver.”</p>
</blockquote>

<p>The resulting model could significantly impact a digital service organization’s top and bottom line.</p>

<blockquote>
  <p>“Our customers will have revenue leakage and make a lot less profit if they don’t address problems early,” he explained.
“This model can provide visibility into where they’re going to end up, so they can make better staffing and resourcing decisions based on better predictability.”</p>
</blockquote>

<p>Parallax hasn’t deployed the model within its tools yet, as it’s still a working prototype that would need further training on additional data. Eventually it could become a commercialized product, whether built into one of the company’s existing tools or offered as a chargeable add-on. Regardless, the company has a blueprint for the future.</p>

<blockquote>
  <p>“We know we can do this,” Ward said.
“We know it works. And OmbuLabs provided the documentation and knowledge transfer for us to own this. It’s one of multiple paths for our future intelligence strategy.”</p>
</blockquote>

<hr />
<h2 id="project-type">Project type:</h2>

<ul>
  <li>Technology consulting engagement</li>
  <li>Machine Learning model development</li>
</ul>

<h2 id="built-using">Built using:</h2>
<ul>
  <li><strong>Language:</strong> Python</li>
  <li><strong>Frameworks:</strong> FastAPI, scikit-learn</li>
  <li><strong>Model Deployment:</strong> MLflow (see the sketch below)</li>
  <li><strong>Hosting:</strong> Azure</li>
</ul>
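<p>As a small, hedged illustration of how the pieces of this stack fit together, the sketch below logs a scikit-learn model with MLflow; the experiment name, model, and metric are placeholders rather than details of the delivered solution.</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code># Minimal MLflow tracking sketch; names and model are placeholders.
import mlflow
import mlflow.sklearn
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge

X, y = make_regression(n_samples=500, n_features=4, noise=0.3, random_state=0)

mlflow.set_experiment("hours-forecast-prototype")
with mlflow.start_run():
    model = Ridge(alpha=1.0).fit(X, y)
    mlflow.log_metric("train_r2", model.score(X, y))
    mlflow.sklearn.log_model(model, "model")
</code></pre></div></div>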

<hr />

<p>Want to build something amazing with OmbuLabs? Check out our <a href="https://www.ombulabs.ai/design-sprint">one-week Design Sprint service</a>! We can take you from idea to prototype in 5 days. 🚀</p>]]></content><author><name>abizzinotto</name></author><category term="machine-learning" /><summary type="html"><![CDATA[Parallax was beginning to explore the use of artificial intelligence (AI) or machine learning (ML) to leverage the wealth of data on hand about customer projects, with the goal of improving their resource planning. The company thought it might be possible to create a machine learning model that identifies customer projects at risk, equipping the Customer Success team to make data-driven recommendations on how to head off problems before they occur.]]></summary><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://www.ombulabs.ai/blog/parallax-high-value-concept.jpg" /><media:content medium="image" url="https://www.ombulabs.ai/blog/parallax-high-value-concept.jpg" xmlns:media="http://search.yahoo.com/mrss/" /></entry></feed>