RAG vs Fine-Tuning: When to Use Which for Your AI Application
RAG and fine-tuning are two ways to customize AI models, but they solve different problems. Here's when to use RAG, when to fine-tune, and when to use both.
I wanted to build a RAG system for my documentation. Three attempts, three failures. Here's what went wrong each time, why it failed, and what finally worked.
I wanted to run multiple LLMs simultaneously on my GPU. Simple goal, right? Wrong. GPU memory management became a nightmare. Here's what I learned the hard way.
I spent a week testing Gemini 2.0 and GPT-4o side-by-side on real work tasks. Not benchmarks or demos—actual coding, writing, and analysis. Here's what I found, when to use which, and the real costs.
I spent $2,400 fine-tuning a language model, thinking it would solve my problem. Three months later, I realized I could have achieved 90% of the results with prompt engineering for $0. Here's the expensive truth about fine-tuning that nobody tells you.
After trying three different approaches to build a local LLM chat interface, I finally found what works. Here's what failed, what succeeded, and the real performance numbers you won't find in tutorials.
Modern AI models are breaking on 12GB cards. After running local LLMs, training models, and deploying AI systems, I've learned that 24GB VRAM is now the practical minimum for serious AI work. Here's why, and what it means for your hardware choices—comparing RTX 3090, 4090, and A6000.
From experimenting with speech-to-text to training lightweight predictive models, I've created a personal AI lab at home powered by high-end consumer hardware. The goal? Run local LLMs, real-time voice agents, VLMs, and GPU-accelerated automation workflows without relying on cloud costs.
Build production-grade AI voice agents that understand natural language, access knowledge bases, and handle real customer calls. Complete guide with Asterisk, Whisper, LLMs, and RAG—achieving 73% automation with 2.7s response time.
A complete breakdown of my personal AI lab—running local LLMs, real-time voice agents, and GPU-accelerated workflows without cloud costs. Hardware specs, software stack, real-world use cases, and why local AI is the future.
Interested in LLM solutions? Let's discuss how I can help with your project.