<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[SwirlAI Newsletter]]></title><description><![CDATA[Learn about end-to-end Data Systems and stay up to date with what is happening in the Data World.]]></description><link>https://www.newsletter.swirlai.com</link><image><url>https://substackcdn.com/image/fetch/$s_!JA65!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1382ba7c-34db-4b33-bbb2-8e23af4b6f7a_1280x1280.png</url><title>SwirlAI Newsletter</title><link>https://www.newsletter.swirlai.com</link></image><generator>Substack</generator><lastBuildDate>Tue, 28 Apr 2026 11:34:37 GMT</lastBuildDate><atom:link href="https://www.newsletter.swirlai.com/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Aurimas Griciūnas]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[swirlai@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[swirlai@substack.com]]></itunes:email><itunes:name><![CDATA[Aurimas Griciūnas]]></itunes:name></itunes:owner><itunes:author><![CDATA[Aurimas Griciūnas]]></itunes:author><googleplay:owner><![CDATA[swirlai@substack.com]]></googleplay:owner><googleplay:email><![CDATA[swirlai@substack.com]]></googleplay:email><googleplay:author><![CDATA[Aurimas Griciūnas]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[State of Context Engineering in 2026]]></title><description><![CDATA[Five patterns for managing what goes into the context window, and when.]]></description><link>https://www.newsletter.swirlai.com/p/state-of-context-engineering-in-2026</link><guid 
isPermaLink="false">https://www.newsletter.swirlai.com/p/state-of-context-engineering-in-2026</guid><dc:creator><![CDATA[Aurimas Griciūnas]]></dc:creator><pubDate>Sun, 22 Mar 2026 12:22:06 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/d47b96f2-27c5-4b8b-abb5-12d3494b04f7_5738x3255.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#128075; <em>I am <a href="https://www.linkedin.com/in/aurimas-griciunas">Aurimas</a>. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in AI Engineering, Data Engineering, Machine Learning and overall Data space.</em></p><div><hr></div><p>Context engineering has gone from a niche concern to the core discipline of AI engineering in under a year. In mid-2025, two posts laid much of the groundwork. Manus shared lessons from rebuilding their agent framework four times (July 2025). Anthropic followed with their guide on effective context engineering for agents (September 2025). That was eight months ago. 
The patterns they described have since matured and been adopted across platforms.</p><p>I covered the foundations of context engineering in an earlier article:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;55fbed83-c26f-4ff8-ae97-da6c2b87d8e1&quot;,&quot;caption&quot;:&quot;&#128075; I am Aurimas. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in AI Engineering, Data Engineering, Machine Learning and overall Data space.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Breaking Down Context Engineering&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;Aurimas Griciunas is an AI engineering expert, LinkedIn Top Voice, and the creator of the popular SwirlAI newsletter, trusted by thousands of data and AI 
professionals.&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-08-30T07:01:29.799Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0ff8da46-fa9f-4589-89ac-46d5ed749787_836x568.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/breaking-down-context-engineering&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:171633835,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:51,&quot;comment_count&quot;:0,&quot;publication_id&quot;:1144171,&quot;publication_name&quot;:&quot;SwirlAI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!JA65!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1382ba7c-34db-4b33-bbb2-8e23af4b6f7a_1280x1280.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>This piece focuses on what has changed since then, the patterns that have matured, and the tradeoffs you need to understand as an AI engineer.</p><p></p><h3>Why Context Engineering Matters More Than Model Capability</h3><p>The core insight is simple. LLMs have a finite attention budget. Every token in the context window competes for that attention. As context grows, precision drops, reasoning weakens, and the model starts missing information it should catch. 
Research calls these the &#8220;lost-in-the-middle&#8221; and &#8220;needle in the haystack&#8221; problems.</p><p>Anthropic frames it well: context engineering means finding the smallest possible set of high-signal tokens that maximise the likelihood of desired outcomes. The discipline covers everything that lands in the context window: system instructions, tool definitions, MCP resources, retrieved documents, conversation history, and accumulated action history.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3s1r!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F596bbacc-003c-40e5-9df0-dd88ca81da84_4703x2672.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3s1r!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F596bbacc-003c-40e5-9df0-dd88ca81da84_4703x2672.png 424w, https://substackcdn.com/image/fetch/$s_!3s1r!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F596bbacc-003c-40e5-9df0-dd88ca81da84_4703x2672.png 848w, https://substackcdn.com/image/fetch/$s_!3s1r!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F596bbacc-003c-40e5-9df0-dd88ca81da84_4703x2672.png 1272w, https://substackcdn.com/image/fetch/$s_!3s1r!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F596bbacc-003c-40e5-9df0-dd88ca81da84_4703x2672.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!3s1r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F596bbacc-003c-40e5-9df0-dd88ca81da84_4703x2672.png" width="1456" height="827" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/596bbacc-003c-40e5-9df0-dd88ca81da84_4703x2672.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d3e31d83-019e-4186-99c0-0404bb82acb4_4703x2672.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:827,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:530041,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/191721101?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd3e31d83-019e-4186-99c0-0404bb82acb4_4703x2672.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3s1r!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F596bbacc-003c-40e5-9df0-dd88ca81da84_4703x2672.png 424w, https://substackcdn.com/image/fetch/$s_!3s1r!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F596bbacc-003c-40e5-9df0-dd88ca81da84_4703x2672.png 848w, https://substackcdn.com/image/fetch/$s_!3s1r!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F596bbacc-003c-40e5-9df0-dd88ca81da84_4703x2672.png 1272w, 
https://substackcdn.com/image/fetch/$s_!3s1r!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F596bbacc-003c-40e5-9df0-dd88ca81da84_4703x2672.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">The Context Window</figcaption></figure></div><p>Several patterns have emerged for managing context effectively: progressive disclosure for controlling what loads and when, compression for shrinking accumulated history, routing for directing queries to the right source, evolved retrieval strategies for getting external knowledge on demand, and 
tool management for controlling the capability surface.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YK76!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2eaa9d1-5c79-4f1d-ae8b-08db857117ba_5073x2881.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YK76!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2eaa9d1-5c79-4f1d-ae8b-08db857117ba_5073x2881.png 424w, https://substackcdn.com/image/fetch/$s_!YK76!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2eaa9d1-5c79-4f1d-ae8b-08db857117ba_5073x2881.png 848w, https://substackcdn.com/image/fetch/$s_!YK76!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2eaa9d1-5c79-4f1d-ae8b-08db857117ba_5073x2881.png 1272w, https://substackcdn.com/image/fetch/$s_!YK76!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2eaa9d1-5c79-4f1d-ae8b-08db857117ba_5073x2881.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YK76!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2eaa9d1-5c79-4f1d-ae8b-08db857117ba_5073x2881.png" width="1456" height="827" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c2eaa9d1-5c79-4f1d-ae8b-08db857117ba_5073x2881.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6aaf852f-d15a-4df9-81a5-183017a50c70_5073x2881.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:827,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:452961,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/191721101?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6aaf852f-d15a-4df9-81a5-183017a50c70_5073x2881.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YK76!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2eaa9d1-5c79-4f1d-ae8b-08db857117ba_5073x2881.png 424w, https://substackcdn.com/image/fetch/$s_!YK76!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2eaa9d1-5c79-4f1d-ae8b-08db857117ba_5073x2881.png 848w, https://substackcdn.com/image/fetch/$s_!YK76!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2eaa9d1-5c79-4f1d-ae8b-08db857117ba_5073x2881.png 1272w, https://substackcdn.com/image/fetch/$s_!YK76!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc2eaa9d1-5c79-4f1d-ae8b-08db857117ba_5073x2881.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Context Engineering Patterns</figcaption></figure></div><p> Each addresses a different dimension of the problem. 
The rest of this piece breaks them down.</p><div><hr></div><p style="text-align: center;">I ran a live workshop on this topic where I walked through these patterns.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://maven.com/p/0bd8ae/state-of-context-engineering-in-2026&quot;,&quot;text&quot;:&quot;Watch The Recording&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://maven.com/p/0bd8ae/state-of-context-engineering-in-2026"><span>Watch The Recording</span></a></p><div><hr></div><p></p><h2>Pattern 1: Progressive Disclosure and Agent Skills</h2><p></p><h3>The problem</h3><p>An agent that handles customer support, billing, refunds, and onboarding needs instructions for all four domains. Loading all instructions upfront wastes most of the context window on irrelevant guidance. The traditional alternative, spinning up separate specialised sub-agents, adds orchestration complexity, duplicates shared logic, and introduces latency from inter-agent communication. Neither scales well as the number of domains grows.</p><p></p><h3>What the pattern does</h3><p>Progressive disclosure loads information in tiers based on relevance. Discovery first (just names and descriptions), activation when relevant (full instructions), execution only during the task (scripts and reference materials).</p><p>Agent Skills are the standard implementation. I covered them in depth in my previous piece:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;8f2c83b0-c003-49f6-8379-4656741f5e82&quot;,&quot;caption&quot;:&quot;&#128075; I am Aurimas. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. 
My mission is to help You UpSkill and keep You updated on the latest news in AI Engineering, Data Engineering, Machine Learning and overall Data space.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Agent Skills: Progressive Disclosure as a System Design Pattern&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;Aurimas Griciunas is an AI engineering expert, LinkedIn Top Voice, and the creator of the popular SwirlAI newsletter, trusted by thousands of data and AI professionals.&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2026-03-11T10:38:30.059Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f8b195c7-bc33-4bee-be52-984301bfdd43_4157x2209.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/agent-skills-progressive-disclosure&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:190516304,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:21,&quot;comment_count&quot;:0,&quot;publication_id&quot;:1144171,&quot;publication_name&quot;:&quot;SwirlAI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!JA65!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1382ba7c-34db-4b33-bbb2-8e23af4b6f7a_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>The 
short version: a skill is a markdown file with YAML frontmatter. The platform reads only the name and description at startup (~80 tokens median per skill). When the model determines a skill is relevant, the full instruction body loads (275 to 8,000 tokens). Supporting scripts and reference materials load only during execution. The format was released by Anthropic in December 2025 and adopted by OpenAI, Google, GitHub, and Cursor within weeks.</p><p>The most interesting application is agent identity management. Rather than routing queries to separate specialised sub-agents, a single agent assumes different identities on demand. At rest, it has a base identity. When a task activates a skill, the agent adopts that skill&#8217;s instructions, constraints, tone, and behavioral patterns. When the task completes, it returns to base. This is what Claude Code already does. It does not spin up a separate &#8220;PDF agent&#8221; and a &#8220;spreadsheet agent.&#8221; It is one agent that activates the relevant skill, shifting its identity to match.</p><p>Agent Skills are not only for coding agents. The pattern generalises to customer support, internal operations, research agents, and any system where agents need broad capability with focused execution. Because skills are plain English markdown, domain experts and team leads can configure agent behavior directly, without engineering expertise.</p><p>An interesting extension: agents that write their own skills. When an agent encounters a task it handles repeatedly, it can extract the pattern into a new skill file. Claude Code supports this through its skill-creator skill. The agent observes its own successful behavior, generalises it, and makes it available for future sessions. 
The quality of self-authored skills varies, but the direction closes the loop: humans write the initial skills, agents extend the library from experience.</p><p></p><p><strong>Tradeoffs.</strong></p><ul><li><p><strong>Accuracy</strong>: High with a small skill set, but degrades with 100+ skills as overlapping descriptions cause misactivation.</p></li><li><p><strong>Latency</strong>: Low. Discovery data is pre-loaded, and activation adds a file read, not an LLM call.</p></li><li><p><strong>Token cost</strong>: Low at rest (all 17 Anthropic skills cost ~1,700 tokens at discovery), but accumulates during a session. The key unsolved question: when does an activated skill get deactivated? Without explicit pruning logic, multiple activated skills destroy the token advantage over time.</p></li><li><p><strong>Maintainability</strong>: Easy per skill, harder at scale. Keeping descriptions non-overlapping across 50+ skills requires governance.</p></li><li><p><strong>Reliability</strong>: Moderate. Skill selection errors compound downstream, and the entire approach depends on selection accuracy at the discovery layer.</p><p></p></li></ul><p></p><h2>Pattern 2: Context Compression</h2><p></p><h3>The problem</h3><p>Every tool call, every observation, every reasoning step adds to the context. Each tool result can be hundreds or thousands of tokens: API responses, file contents, search results, error traces. 
Without intervention, the accumulated action history fills the context window and pushes out the system instructions, tool definitions, and early task context that the model actually needs to reason well.</p><p>A simple ReAct agent, shown in the image below, illustrates the problem:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Inou!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3d2de0-9c4c-4618-b9a8-e8c7316d9d96_11127x6287.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Inou!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3d2de0-9c4c-4618-b9a8-e8c7316d9d96_11127x6287.png 424w, https://substackcdn.com/image/fetch/$s_!Inou!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3d2de0-9c4c-4618-b9a8-e8c7316d9d96_11127x6287.png 848w, https://substackcdn.com/image/fetch/$s_!Inou!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3d2de0-9c4c-4618-b9a8-e8c7316d9d96_11127x6287.png 1272w, https://substackcdn.com/image/fetch/$s_!Inou!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3d2de0-9c4c-4618-b9a8-e8c7316d9d96_11127x6287.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Inou!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3d2de0-9c4c-4618-b9a8-e8c7316d9d96_11127x6287.png" width="1456" height="823" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3f3d2de0-9c4c-4618-b9a8-e8c7316d9d96_11127x6287.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/02f5663c-358b-475a-8b6c-fae1d8dd3fea_11127x6287.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:823,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2503018,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/191721101?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F02f5663c-358b-475a-8b6c-fae1d8dd3fea_11127x6287.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Inou!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3d2de0-9c4c-4618-b9a8-e8c7316d9d96_11127x6287.png 424w, https://substackcdn.com/image/fetch/$s_!Inou!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3d2de0-9c4c-4618-b9a8-e8c7316d9d96_11127x6287.png 848w, https://substackcdn.com/image/fetch/$s_!Inou!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3d2de0-9c4c-4618-b9a8-e8c7316d9d96_11127x6287.png 1272w, https://substackcdn.com/image/fetch/$s_!Inou!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f3d2de0-9c4c-4618-b9a8-e8c7316d9d96_11127x6287.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">ReAct Agent Context Bloat</figcaption></figure></div><ol><li><p>Construct a prompt from system instructions and available tools.</p></li><li><p>The user query is added to the initial prompt and passed to an LLM, which either outputs the final answer to the user&#8217;s query or plans additional actions via tool use in the environment.</p></li><li><p>If tool use is chosen, the tools are executed.</p></li><li><p>The results are piped back into the prompt by appending them to the reasoning and action history.</p></li><li><p>The loop repeats up to N times or until the user&#8217;s query can be answered.</p></li></ol><p>Each turn adds to the context as the conversation history expands. 
This is especially troublesome when tools retrieve large amounts of data.</p><p></p><h3>What the pattern does</h3><p>Context compression shrinks accumulated history while preserving the information the model needs. There are a few ways to handle compression:</p><ul><li><p>Keep only the most recent N turns of interaction and discard the rest.</p></li><li><p>Sliding window compression: keep the most recent N turns unchanged and compress the rest using an LLM.</p></li><li><p>Long-term memory approach: keep the most recent N turns unchanged, move the rest of the history to durable storage, and retrieve only relevant actions on demand.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3wO3!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f7b296-80d1-4f86-925a-8e3eb3df9d56_11127x6287.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3wO3!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f7b296-80d1-4f86-925a-8e3eb3df9d56_11127x6287.png 424w, https://substackcdn.com/image/fetch/$s_!3wO3!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f7b296-80d1-4f86-925a-8e3eb3df9d56_11127x6287.png 848w, https://substackcdn.com/image/fetch/$s_!3wO3!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f7b296-80d1-4f86-925a-8e3eb3df9d56_11127x6287.png 1272w, https://substackcdn.com/image/fetch/$s_!3wO3!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f7b296-80d1-4f86-925a-8e3eb3df9d56_11127x6287.png 
1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3wO3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f7b296-80d1-4f86-925a-8e3eb3df9d56_11127x6287.png" width="1456" height="823" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/77f7b296-80d1-4f86-925a-8e3eb3df9d56_11127x6287.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8dd10ee2-4c31-4ee6-9a10-a1ceb2ff5388_11127x6287.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:823,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3324827,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/191721101?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8dd10ee2-4c31-4ee6-9a10-a1ceb2ff5388_11127x6287.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3wO3!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f7b296-80d1-4f86-925a-8e3eb3df9d56_11127x6287.png 424w, https://substackcdn.com/image/fetch/$s_!3wO3!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f7b296-80d1-4f86-925a-8e3eb3df9d56_11127x6287.png 848w, https://substackcdn.com/image/fetch/$s_!3wO3!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f7b296-80d1-4f86-925a-8e3eb3df9d56_11127x6287.png 1272w, 
https://substackcdn.com/image/fetch/$s_!3wO3!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F77f7b296-80d1-4f86-925a-8e3eb3df9d56_11127x6287.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Conversation and Action History Compression</figcaption></figure></div><p>The field has converged on sliding window plus summarisation hybrids as the dominant approach: keep recent turns in full detail, compress older context through LLM-based summarisation.</p><p>Manus adds two practical details. 
First, keep the most recent tool calls in raw format so the model maintains its "rhythm" and formatting style. Losing that rhythm leads to subtle degradation in output quality. Second, do not compress away error traces. When a tool call fails, leaving the error and stack trace in context helps the model avoid repeating the same mistake. This technique is well-established (libraries like Instructor use it for structured output retries), and it applies broadly to any agent that calls tools.</p><p></p><p><strong>Tradeoffs.</strong></p><ul><li><p><strong>Accuracy</strong>: Moderate. Summarisation preserves the gist but loses detail; any compression is lossy.</p></li><li><p><strong>Latency</strong>: Moderate. Each compression step requires an LLM call. You can amortise this by compressing periodically instead of on every new turn.</p></li><li><p><strong>Token cost</strong>: Low in many cases, particularly for long-running agents, where the compressed history replaces a much longer transcript on every subsequent call.</p></li><li><p><strong>Maintainability</strong>: Requires experimentation with what to keep raw, how many turns to allow before compacting, and how much detail to keep in summaries.</p></li><li><p><strong>Reliability</strong>: Moderate. Works well for long-horizon tasks, but poorly when critical early details get summarised away.</p></li></ul><p style="text-align: center;"></p><h2>Pattern 3: Context Routing</h2><p></p><h3>The problem</h3><p>A multi-domain agent has access to multiple knowledge bases, tool sets, and instruction sets. Loading all of them for every query wastes context and degrades accuracy. A billing question does not need the onboarding knowledge base. A technical support query does not need the refund policy.</p><p></p><h3>What the pattern does</h3><p>Context routing classifies the query and directs it to the right context source before anything enters the context window. Several approaches have emerged:</p><p><strong>Rule-based routing</strong> uses keyword matching or pattern detection. Fast and predictable, but rigid and unreliable when queries don&#8217;t match expected patterns.</p><p><strong>LLM-powered routing</strong> uses the model itself to classify the query and select the appropriate context source. More accurate than rule-based approaches, but adds latency and cost.</p><p><strong>Hierarchical routing</strong> uses a lead agent to triage queries to specialised sub-agents, each with its own focused context window.</p><p><strong>Hybrid routing</strong> combines multiple methods, for example rules for queries that match known patterns and an LLM classifier for everything else.</p><p></p><p><strong>Tradeoffs.</strong></p><ul><li><p><strong>Accuracy</strong>: High for LLM-based routing, moderate for rule-based. LLM routing understands nuance; rule-based routing misses anything outside expected patterns.</p></li><li><p><strong>Latency</strong>: Varies. LLM routing adds an inference call before the main task. Rule-based is near-instant. 
Most production systems combine both.</p></li><li><p><strong>Token cost</strong>: Savings come downstream. By loading only relevant context, you reduce tokens for the main inference.</p></li><li><p><strong>Maintainability</strong>: Rule-based routing requires manual updates for new domains. LLM routing adapts automatically but is harder to debug when it misroutes.</p></li><li><p><strong>Reliability</strong>: LLM routing can hallucinate routing decisions. A fallback to a human or a default agent is necessary.</p></li></ul><p></p><h2>Pattern 4: Retrieval Evolution</h2><p></p><h3>The problem</h3><p>Agents need knowledge that is not in their training data: company documents, product catalogs, policy updates, real-time data. The naive approach (retrieve similar text, stuff it into the prompt, generate) fails on complex queries. A question like &#8220;what themes emerge across this quarter&#8217;s customer feedback?&#8221; requires connecting information across multiple documents, something vector similarity search cannot do. A question where the first retrieval returns insufficient results needs a second attempt with a reformulated query, something a fixed pipeline cannot do.</p><p></p><h3>What the pattern does</h3><p>RAG has matured from fixed pipelines to agent-controlled retrieval loops. Three evolutions stand out.</p><p>I wrote about the evolution of RAG architectures in one of my previous articles:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;80dd148e-9765-4a8b-9af9-e8289cbd35e9&quot;,&quot;caption&quot;:&quot;&#128075; I am Aurimas. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. 
My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;The evolution of Modern RAG Architectures.&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;Aurimas Griciunas is an AI engineering expert, LinkedIn Top Voice, and the creator of the popular SwirlAI newsletter, trusted by thousands of data and AI professionals.&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-04-07T07:43:33.250Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7430fbad-21da-4918-88cc-3d593254f310_2789x2392.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/the-evolution-of-modern-rag-architectures&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:159546301,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:107,&quot;comment_count&quot;:2,&quot;publication_id&quot;:1144171,&quot;publication_name&quot;:&quot;SwirlAI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!JA65!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1382ba7c-34db-4b33-bbb2-8e23af4b6f7a_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p><strong>Agentic 
RAG</strong> puts retrieval under agent control. Instead of a fixed pipeline (query &#8594; vector search &#8594; inject &#8594; generate), the agent decides its own search strategy, can reformulate queries when results are insufficient, and iterates until confident. The retrieval loop replaces the retrieval pipeline.</p><p><strong>Graph RAG</strong> adds relational reasoning. Standard vector search finds similar text but cannot connect entities across documents. Graph-based approaches build entity-relationship graphs over the corpus, enabling thematic and relational questions that require connecting information across multiple sources.</p><p><strong>Self-RAG</strong> trains models to decide when to retrieve and to critique their own outputs. The model assesses whether it has enough information before answering, triggers retrieval only when needed, and evaluates the quality of retrieved results before using them.</p><p>The most advanced work combines all three:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!J6Rx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c39f377-717b-446f-a5c9-5c34916c41c2_6058x4827.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!J6Rx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c39f377-717b-446f-a5c9-5c34916c41c2_6058x4827.png 424w, https://substackcdn.com/image/fetch/$s_!J6Rx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c39f377-717b-446f-a5c9-5c34916c41c2_6058x4827.png 848w, 
https://substackcdn.com/image/fetch/$s_!J6Rx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c39f377-717b-446f-a5c9-5c34916c41c2_6058x4827.png 1272w, https://substackcdn.com/image/fetch/$s_!J6Rx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c39f377-717b-446f-a5c9-5c34916c41c2_6058x4827.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!J6Rx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c39f377-717b-446f-a5c9-5c34916c41c2_6058x4827.png" width="1456" height="1160" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7c39f377-717b-446f-a5c9-5c34916c41c2_6058x4827.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7341c1f3-4cae-4019-b767-21f7c287c9c1_6058x4827.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1160,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1254849,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/191721101?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7341c1f3-4cae-4019-b767-21f7c287c9c1_6058x4827.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!J6Rx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c39f377-717b-446f-a5c9-5c34916c41c2_6058x4827.png 424w, 
https://substackcdn.com/image/fetch/$s_!J6Rx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c39f377-717b-446f-a5c9-5c34916c41c2_6058x4827.png 848w, https://substackcdn.com/image/fetch/$s_!J6Rx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c39f377-717b-446f-a5c9-5c34916c41c2_6058x4827.png 1272w, https://substackcdn.com/image/fetch/$s_!J6Rx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c39f377-717b-446f-a5c9-5c34916c41c2_6058x4827.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Agentic RAG +</figcaption></figure></div><p></p><p><strong>Tradeoffs.</strong></p><ul><li><p><strong>Accuracy</strong>: The strongest dimension. Agent-controlled retrieval can reformulate queries, try multiple strategies, and iterate until confident.</p></li><li><p><strong>Latency</strong>: High. A single question might trigger three to five retrieval cycles.</p></li><li><p><strong>Token cost</strong>: High. Each cycle adds retrieved chunks to context plus the agent&#8217;s reasoning about strategy. Cost scales with question complexity.</p></li><li><p><strong>Maintainability</strong>: Moderate. Debugging &#8220;why did the agent choose this retrieval strategy?&#8221; is harder than debugging a fixed pipeline.</p></li><li><p><strong>Reliability</strong>: Needs guardrails. Agentic RAG can over-retrieve on simple questions, so maximum retrieval rounds, confidence thresholds, and fallback to direct generation are essential.</p></li></ul><p></p><h2>Pattern 5: Tool and Capability Management</h2><p></p><h3>The problem</h3><p>Agents need tools to interact with the world: APIs, databases, file systems, search engines. Each tool requires a JSON schema definition that the model reads to understand what the tool does and how to call it. 
A single complex schema (nested objects, enums, parameter descriptions) can consume 500+ tokens. Connect a few MCP servers and you might reach 90+ tool definitions, over 50,000 tokens of schemas before the model starts reasoning. This is not a theoretical concern. OpenAI recommends fewer than 20 tools per agent, with accuracy degrading past 10.</p><p></p><h3>What the pattern does</h3><p>MCP (Model Context Protocol) has become the standard for connecting agents to external tools. Originally released by Anthropic in November 2024, it is now governed by the Agentic AI Foundation under the Linux Foundation.</p><p>MCP solves the connection problem. The context cost problem remains unsolved.</p><p>You can read more about the protocol in one of my articles here:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;6dae4f4a-b08a-4be1-ae3a-82a94c00e3cf&quot;,&quot;caption&quot;:&quot;&#128075; I am Aurimas. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. 
My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Everything you need to know about MCP.&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;Aurimas Griciunas is an AI engineering expert, LinkedIn Top Voice, and the creator of the popular SwirlAI newsletter, trusted by thousands of data and AI professionals.&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-03-15T15:16:01.285Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9c348e1b-a175-4c65-8ea6-d773f957488e_1934x1554.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/everything-you-need-to-know-about&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:159065609,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:113,&quot;comment_count&quot;:7,&quot;publication_id&quot;:1144171,&quot;publication_name&quot;:&quot;SwirlAI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!JA65!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1382ba7c-34db-4b33-bbb2-8e23af4b6f7a_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>Manus found a practical 
constraint: avoid dynamically adding or removing tools mid-iteration, because tool definitions sit near the front of the context, and any change invalidates the KV-cache for all subsequent actions.</p><p>Beyond raw context cost, tool management introduces several open problems:</p><p><strong>Description quality.</strong> The model selects tools based on descriptions, but most MCP server authors write descriptions for humans, not models. Too vague and the model picks the wrong tool. Too verbose and you waste context on a single schema.</p><p><strong>Tool overlap across MCP servers.</strong> Two different servers might offer similar capabilities (two search tools, two file readers). Without deduplication or preference logic, the model picks arbitrarily.</p><p><strong>No versioning for tool contracts.</strong> When an MCP server updates its tool schemas, the agent has no way to know. Stale descriptions in cache cause silent failures.</p><p><strong>Security surface scales with tool count.</strong> Each connected MCP server is an attack surface. Tool outputs can contain prompt injection attempts, and the more tools available, the larger the exposure.</p><p></p><p><strong>Tradeoffs.</strong></p><ul><li><p><strong>Accuracy</strong>: Depends entirely on description quality, which is unsolved.</p></li><li><p><strong>Latency</strong>: Low for discovery, but changing tools mid-iteration invalidates the KV-cache, adding significant latency.</p></li><li><p><strong>Token cost</strong>: The biggest hidden cost in many agent systems. A single complex JSON schema can consume 500+ tokens. 90 tools means 50K+ tokens before any user interaction.</p></li><li><p><strong>Maintainability</strong>: MCP standardises the interface, but not description quality, schema conventions, or versioning.</p></li><li><p><strong>Reliability</strong>: Moderate. Again, MCP standardises the interface but not the quality of the underlying tool. 
Each connected server expands the attack surface.</p></li></ul><p></p><h2>Putting It Together</h2><p>These patterns are not alternatives. In a production agent system, you layer them: progressive disclosure and tool management define what can enter the context window, routing and compression manage what stays during execution, retrieval brings in external knowledge on demand, and evaluation measures whether any of it is working. Each layer addresses a different failure mode.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4JEU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd27321c-34b4-4e96-ad2b-e63dbf286ae4_5738x3255.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4JEU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd27321c-34b4-4e96-ad2b-e63dbf286ae4_5738x3255.png 424w, https://substackcdn.com/image/fetch/$s_!4JEU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd27321c-34b4-4e96-ad2b-e63dbf286ae4_5738x3255.png 848w, https://substackcdn.com/image/fetch/$s_!4JEU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd27321c-34b4-4e96-ad2b-e63dbf286ae4_5738x3255.png 1272w, https://substackcdn.com/image/fetch/$s_!4JEU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd27321c-34b4-4e96-ad2b-e63dbf286ae4_5738x3255.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!4JEU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd27321c-34b4-4e96-ad2b-e63dbf286ae4_5738x3255.png" width="1456" height="826" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bd27321c-34b4-4e96-ad2b-e63dbf286ae4_5738x3255.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7d58135b-23af-4d0c-8a9b-a9d13128420a_5738x3255.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:826,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:864725,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/191721101?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d58135b-23af-4d0c-8a9b-a9d13128420a_5738x3255.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4JEU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd27321c-34b4-4e96-ad2b-e63dbf286ae4_5738x3255.png 424w, https://substackcdn.com/image/fetch/$s_!4JEU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd27321c-34b4-4e96-ad2b-e63dbf286ae4_5738x3255.png 848w, https://substackcdn.com/image/fetch/$s_!4JEU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd27321c-34b4-4e96-ad2b-e63dbf286ae4_5738x3255.png 1272w, 
https://substackcdn.com/image/fetch/$s_!4JEU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbd27321c-34b4-4e96-ad2b-e63dbf286ae4_5738x3255.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Context Engineering: Layered Architecture</figcaption></figure></div><p></p><h2>Where to Start</h2><p>If your agents run long tasks, add compression first. 
Hybrid sliding window (keep the latest N turns raw, summarise older ones) is the most practical starting point, and probe-based evaluation will tell you whether your summaries are preserving what matters.</p><p>If your agents serve multiple domains, add routing. Even keyword-based rules cut context bloat before you invest in LLM-based classification.</p><p>If your agents connect to multiple MCP servers, audit the token cost. Count how many tokens your tool schemas consume before any user interaction. That number is usually higher than expected.</p><p>Hope you enjoyed this piece and hope to see you in the next one, cheers!</p>]]></content:encoded></item><item><title><![CDATA[New: SwirlAI on YouTube + Context Engineering Workshop]]></title><description><![CDATA[Hands-on AI engineering content, now in video form, plus a live workshop on context engineering this Friday]]></description><link>https://www.newsletter.swirlai.com/p/new-swirlai-on-youtube-context-engineering</link><guid isPermaLink="false">https://www.newsletter.swirlai.com/p/new-swirlai-on-youtube-context-engineering</guid><dc:creator><![CDATA[Aurimas Griciūnas]]></dc:creator><pubDate>Tue, 17 Mar 2026 10:51:36 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/07f7b5b3-00f2-443a-9d9c-fe852717d794_1506x1222.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2><strong>SwirlAI YouTube Channel is Live</strong></h2><p>I have officially launched a YouTube channel. The first four videos are already up as a free pre-course for the End-to-End AI Engineering Bootcamp.</p><p>One thing I see consistently: people jump straight into building with LLMs without a proper development environment. 
They run everything locally with no containerisation, hardcode API keys, and tangle frontend and backend logic together. It works until it doesn&#8217;t. When they need to collaborate, deploy, or debug, everything falls apart.</p><p>A solid development setup is not optional for AI engineers. It&#8217;s the foundation everything else builds on. These four videos walk you through it from scratch:</p><div><hr></div><p><strong>Part 1: Setting Up Your Development Environment</strong> Configure your development environment and LLM API keys from scratch.</p><div id="youtube2-3WsEgLtwsGs" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;3WsEgLtwsGs&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/3WsEgLtwsGs?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><div><hr></div><p><strong>Part 2: Build and Containerise Your First Chatbot</strong> Build a Streamlit-based chatbot and deploy it with Docker Compose.</p><div id="youtube2-cGOGA7GFecI" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;cGOGA7GFecI&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/cGOGA7GFecI?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><div><hr></div><p><strong>Part 3: Moving the Agent Behind a FastAPI Server</strong> Extract the LLM calling logic into a FastAPI REST server and run it as a separate service.</p><div id="youtube2-bZHaCAFAUCs" class="youtube-wrap" 
data-attrs="{&quot;videoId&quot;:&quot;bZHaCAFAUCs&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/bZHaCAFAUCs?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><div><hr></div><p><strong>Part 4: Containerising Backend and Frontend</strong> Split the application into separate FastAPI and Streamlit containers, each deployed independently via Docker Compose.</p><div id="youtube2-IcbzksbMhuM" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;IcbzksbMhuM&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/IcbzksbMhuM?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><div><hr></div><p></p><p>By the end of the four videos, you have a clean separation between frontend, backend, and LLM logic, all running in containers. This is the foundation the full bootcamp builds on.</p><p>The videos are designed to be accessible to anyone with basic programming knowledge. You don&#8217;t need to be a software engineer. If you&#8217;re a technical PM, data analyst, or someone moving into AI from an adjacent role, this is a practical starting point to understand how AI applications are built and deployed.</p><p>More videos are coming in the upcoming weeks. 
Subscribe to the channel so you don't miss them.</p><p></p><p style="text-align: center;">All code is available on GitHub:</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://github.com/swirl-ai/ai-engineering-bootcamp-prerequisites&quot;,&quot;text&quot;:&quot;GitHub Repository&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://github.com/swirl-ai/ai-engineering-bootcamp-prerequisites"><span>GitHub Repository</span></a></p><p></p><p style="text-align: center;">The full bootcamp syllabus and registration:</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://maven.com/swirl-ai/end-to-end-ai-engineering&quot;,&quot;text&quot;:&quot;End-to-End AI Engineering Bootcamp&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://maven.com/swirl-ai/end-to-end-ai-engineering"><span>End-to-End AI Engineering Bootcamp</span></a></p><p style="text-align: center;">Apply code <strong>LASTCHANCE15 </strong>at the check-out for 15% off.</p><p></p><h2><strong>Free Workshop: State of Context Engineering in 2026</strong></h2><p>On <strong>Friday, March 20</strong> I&#8217;m running a 45-minute workshop about context engineering.</p><p>Most agent failures come from poor context engineering, not weak model capability. Teams overload prompts with instructions, tools, and retrieved information. 
The result: brittle systems that are costly and difficult to scale.</p><p>In this session, I&#8217;ll cover:</p><p></p><ul><li><p><strong>Core patterns</strong> shaping context engineering in 2026: how teams give models the right information at the right time</p></li><li><p><strong>Tradeoffs</strong> between context strategies, compared across accuracy, latency, token cost, maintainability, and reliability</p></li><li><p><strong>Practical application</strong>: using routing, progressive disclosure, and modular capabilities to build agents that are more efficient and reliable</p></li></ul><p></p><p>If you&#8217;ve been following the recent newsletter on Agent Skills and progressive disclosure, this session connects directly to those ideas and goes broader.</p><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DpI5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e056e1-b9fe-4b47-9c6d-00d3dc3d8b78_3750x3750.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!DpI5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e056e1-b9fe-4b47-9c6d-00d3dc3d8b78_3750x3750.png 424w, https://substackcdn.com/image/fetch/$s_!DpI5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e056e1-b9fe-4b47-9c6d-00d3dc3d8b78_3750x3750.png 848w, https://substackcdn.com/image/fetch/$s_!DpI5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e056e1-b9fe-4b47-9c6d-00d3dc3d8b78_3750x3750.png 1272w, 
https://substackcdn.com/image/fetch/$s_!DpI5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e056e1-b9fe-4b47-9c6d-00d3dc3d8b78_3750x3750.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DpI5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e056e1-b9fe-4b47-9c6d-00d3dc3d8b78_3750x3750.png" width="477" height="477" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/36e056e1-b9fe-4b47-9c6d-00d3dc3d8b78_3750x3750.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1456,&quot;width&quot;:1456,&quot;resizeWidth&quot;:477,&quot;bytes&quot;:1510760,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/191231389?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e056e1-b9fe-4b47-9c6d-00d3dc3d8b78_3750x3750.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DpI5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e056e1-b9fe-4b47-9c6d-00d3dc3d8b78_3750x3750.png 424w, https://substackcdn.com/image/fetch/$s_!DpI5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e056e1-b9fe-4b47-9c6d-00d3dc3d8b78_3750x3750.png 848w, 
https://substackcdn.com/image/fetch/$s_!DpI5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e056e1-b9fe-4b47-9c6d-00d3dc3d8b78_3750x3750.png 1272w, https://substackcdn.com/image/fetch/$s_!DpI5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F36e056e1-b9fe-4b47-9c6d-00d3dc3d8b78_3750x3750.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p></p><p style="text-align: center;">Register for the webinar here:</p><p class="button-wrapper" 
data-attrs="{&quot;url&quot;:&quot;https://maven.com/p/0bd8ae/state-of-context-engineering-in-2026&quot;,&quot;text&quot;:&quot;State of Context Engineering in 2026&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://maven.com/p/0bd8ae/state-of-context-engineering-in-2026"><span>State of Context Engineering in 2026</span></a></p>
]]></content:encoded></item><item><title><![CDATA[Agent Skills: Progressive Disclosure as a System Design Pattern]]></title><description><![CDATA[A simple file format, powered by an architectural shift in how agents manage context.]]></description><link>https://www.newsletter.swirlai.com/p/agent-skills-progressive-disclosure</link><guid isPermaLink="false">https://www.newsletter.swirlai.com/p/agent-skills-progressive-disclosure</guid><dc:creator><![CDATA[Aurimas Griciūnas]]></dc:creator><pubDate>Wed, 11 Mar 2026 10:38:30 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/f8b195c7-bc33-4bee-be52-984301bfdd43_4157x2209.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#128075; <em>I am <a href="https://www.linkedin.com/in/aurimas-griciunas">Aurimas</a>. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. 
My mission is to help You UpSkill and keep You updated on the latest news in AI Engineering, Data Engineering, Machine Learning and overall Data space.</em></p><div><hr></div><p>Everyone is talking about Agent Skills. On December 18, 2025, Anthropic released them as an open standard. Within weeks, OpenAI, Google, GitHub, and Cursor adopted it. Less than three months later, marketplaces like SkillsMP index over 400,000 skills across platforms.</p><p>If you look at a skill file, it&#8217;s very simple - a markdown file with some YAML at the top. That&#8217;s it.</p><p>So why did every major AI platform rush to support it?</p><p>The skill file is simple on purpose. It structures information into layers so that platforms like Claude Code, Codex CLI, and Gemini CLI can load context progressively: name and description first, full instructions when relevant, supporting materials only during execution. The file is the contract. The platform implements the interface. 
The design pattern driving both is <strong>progressive disclosure.</strong></p><p>Let&#8217;s unpack how this works.</p><p></p><h3>What Are Agent Skills?</h3><p>Before we get to why they matter, let&#8217;s look at what they actually are.</p><p>An Agent Skill is a directory containing a <code>SKILL.md</code> file. That file has two parts: YAML frontmatter (metadata) and Markdown body (instructions).</p><p>Here&#8217;s a real example from Anthropic&#8217;s official skills repository. This is the <code>pdf</code> skill:</p><pre><code><code>pdf/
&#9500;&#9472;&#9472; SKILL.md
&#9500;&#9472;&#9472; reference.md
&#9500;&#9472;&#9472; forms.md
&#9492;&#9472;&#9472; scripts/
    &#9500;&#9472;&#9472; check_bounding_boxes.py
    &#9500;&#9472;&#9472; check_fillable_fields.py
    &#9500;&#9472;&#9472; convert_pdf_to_images.py
    &#9500;&#9472;&#9472; create_validation_image.py
    &#9500;&#9472;&#9472; extract_form_field_info.py
    &#9500;&#9472;&#9472; extract_form_structure.py
    &#9500;&#9472;&#9472; fill_fillable_fields.py
    &#9492;&#9472;&#9472; fill_pdf_form_with_annotations.py
</code></code></pre><p>And the contents of the <code>SKILL.md</code> file:</p><pre><code><code>---
name: pdf
description: Use this skill whenever the user wants to do anything with PDF files. This includes reading or extracting text/tables from PDFs, combining or merging multiple PDFs into one, splitting PDFs apart, rotating pages, adding watermarks, creating new PDFs, filling PDF forms, encrypting/decrypting PDFs, extracting images, and OCR on scanned PDFs to make them searchable. If the user mentions a .pdf file or asks to produce one, use this skill.
license: Proprietary. LICENSE.txt has complete terms
---

# PDF Processing Guide

## Overview

This guide covers essential PDF processing operations using Python libraries and command-line tools. For advanced features, JavaScript libraries, and detailed examples, see REFERENCE.md. If you need to fill out a PDF form, read FORMS.md and follow its instructions.

## Quick Start

...
</code></code></pre><p>Notice the directory structure. The <code>SKILL.md</code> file contains the skill&#8217;s metadata and instructions. The reference docs (<code>reference.md</code>, <code>forms.md</code>) provide supporting documentation. The <code>scripts/</code> folder holds 8 Python utilities the agent can call during execution.</p><p>Skills live in predictable locations. On Claude Code, that&#8217;s <code>~/.claude/skills/</code> for personal skills or <code>.claude/skills/</code> inside a project. Codex CLI uses <code>.agents/skills/</code>. Gemini CLI uses <code>.gemini/skills/</code>. The paths differ, but the format is identical across all of them.</p><p>The key distinction from system prompts or custom instructions: skills are <strong>modular and selectively loaded.</strong> A system prompt is always on. A skill sits dormant until the platform decides it&#8217;s relevant to the current task. The file organizes information into layers. The platform decides when to load each layer.</p><p></p><h3>The Problem: Context Windows Are Not Free</h3><p>Best practice is to expose fewer than 20 tools to an agent at once, and accuracy already starts to degrade past 10. The same principle applies to instructions. An agent handling customer support, billing, refunds, and onboarding doesn&#8217;t need all four workflow guides loaded when the user asks about a refund. Connect a few MCP servers without managing what loads, and you quickly reach 90+ tool definitions: over 50,000 tokens of JSON schemas before the model even starts reasoning. System prompts, workflow instructions, conversation history, and retrieved documents then stack on top of that.</p><p>As context grows, the model's attention degrades and important information gets buried. Models often miss information placed in the middle of long contexts, a well-documented phenomenon called "lost-in-the-middle." 
The more irrelevant context surrounds the relevant pieces, the worse retrieval accuracy gets.</p><p></p><div><hr></div><p>Join the March Cohort of my End-to-End AI Engineering Bootcamp to learn how to solve the challenges of Context Engineering for production systems in the real world. (Use code <strong>KICKOFF15 </strong>at the checkout for 15% discount).</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://swrlai.com/ai-bootcamp&quot;,&quot;text&quot;:&quot;AI Engineering Bootcamp&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://swrlai.com/ai-bootcamp"><span>AI Engineering Bootcamp</span></a></p><div><hr></div><p></p><h3>Progressive Disclosure and the Three-Tier Architecture</h3><p>Progressive disclosure is a well-established design pattern, formalized by the Nielsen Norman Group for user interface design. The principle: show only what is needed for the immediate task and defer everything else. Advanced options live behind a click, rarely-used features hide in secondary menus. This reduces cognitive load, lowers error rates, and makes systems usable by a wider audience.</p><p>This is part of a broader trend: design patterns built for human cognition transfer well to agents. Agent memory systems already mirror human memory, separating short-term working memory from long-term storage. Progressive disclosure follows the same logic. The context window is the agent&#8217;s cognitive space. Overloading it degrades performance, while keeping it focused lets the agent reason sharply. Agent Skills apply this principle to how agents access knowledge.</p><p>I have written more about context engineering and memory management problems here:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;7816c0f6-0c66-433a-a696-1510689be2d0&quot;,&quot;caption&quot;:&quot;&#128075; I am Aurimas. 
I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in AI Engineering, Data Engineering, Machine Learning and overall Data space.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Breaking Down Context Engineering&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;Aurimas Griciunas is an AI engineering expert, LinkedIn Top Voice, and the creator of the popular SwirlAI newsletter, trusted by thousands of data and AI professionals.&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-08-30T07:01:29.799Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0ff8da46-fa9f-4589-89ac-46d5ed749787_836x568.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/breaking-down-context-engineering&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:171633835,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:50,&quot;comment_count&quot;:0,&quot;publication_id&quot;:1144171,&quot;publication_name&quot;:&quot;SwirlAI 
Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!JA65!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1382ba7c-34db-4b33-bbb2-8e23af4b6f7a_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>The <code>SKILL.md</code> file organizes information into three layers. The platform implements the loading logic, deciding when to promote from one layer to the next.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!l0bz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462b24c3-fe63-4942-ad5e-b3418bcfa034_4157x2868.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!l0bz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462b24c3-fe63-4942-ad5e-b3418bcfa034_4157x2868.png 424w, https://substackcdn.com/image/fetch/$s_!l0bz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462b24c3-fe63-4942-ad5e-b3418bcfa034_4157x2868.png 848w, https://substackcdn.com/image/fetch/$s_!l0bz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462b24c3-fe63-4942-ad5e-b3418bcfa034_4157x2868.png 1272w, https://substackcdn.com/image/fetch/$s_!l0bz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462b24c3-fe63-4942-ad5e-b3418bcfa034_4157x2868.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!l0bz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462b24c3-fe63-4942-ad5e-b3418bcfa034_4157x2868.png" width="1456" height="1005" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/462b24c3-fe63-4942-ad5e-b3418bcfa034_4157x2868.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2cdb8c1b-0907-442e-aa74-d0f31165009e_4157x2868.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1005,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:445502,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/190516304?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2cdb8c1b-0907-442e-aa74-d0f31165009e_4157x2868.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!l0bz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462b24c3-fe63-4942-ad5e-b3418bcfa034_4157x2868.png 424w, https://substackcdn.com/image/fetch/$s_!l0bz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462b24c3-fe63-4942-ad5e-b3418bcfa034_4157x2868.png 848w, https://substackcdn.com/image/fetch/$s_!l0bz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462b24c3-fe63-4942-ad5e-b3418bcfa034_4157x2868.png 1272w, 
https://substackcdn.com/image/fetch/$s_!l0bz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F462b24c3-fe63-4942-ad5e-b3418bcfa034_4157x2868.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">The Three-Tier Architecture</figcaption></figure></div><p></p><h4>Layer 1: Discovery</h4><p>At startup, the platform reads only the skill's name and one-line description from the YAML frontmatter. 
I measured this across Anthropic's 17 official skills: the median discovery cost is ~80 tokens per skill, ranging from ~55 (webapp-testing) to ~235 (xlsx). All 17 skills together cost ~1,700 tokens, meaning an agent can be aware of dozens of skills for less context than a single activated skill.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HuRj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84f393c7-009a-463e-b5ba-d8937340aeb8_5911x3352.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HuRj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84f393c7-009a-463e-b5ba-d8937340aeb8_5911x3352.png 424w, https://substackcdn.com/image/fetch/$s_!HuRj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84f393c7-009a-463e-b5ba-d8937340aeb8_5911x3352.png 848w, https://substackcdn.com/image/fetch/$s_!HuRj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84f393c7-009a-463e-b5ba-d8937340aeb8_5911x3352.png 1272w, https://substackcdn.com/image/fetch/$s_!HuRj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84f393c7-009a-463e-b5ba-d8937340aeb8_5911x3352.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HuRj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84f393c7-009a-463e-b5ba-d8937340aeb8_5911x3352.png" width="1456" height="826" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/84f393c7-009a-463e-b5ba-d8937340aeb8_5911x3352.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9f7a2b7d-2c90-4b3a-b327-4f4d3f4ee73b_5911x3352.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:826,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1907994,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/190516304?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9f7a2b7d-2c90-4b3a-b327-4f4d3f4ee73b_5911x3352.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HuRj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84f393c7-009a-463e-b5ba-d8937340aeb8_5911x3352.png 424w, https://substackcdn.com/image/fetch/$s_!HuRj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84f393c7-009a-463e-b5ba-d8937340aeb8_5911x3352.png 848w, https://substackcdn.com/image/fetch/$s_!HuRj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84f393c7-009a-463e-b5ba-d8937340aeb8_5911x3352.png 1272w, https://substackcdn.com/image/fetch/$s_!HuRj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F84f393c7-009a-463e-b5ba-d8937340aeb8_5911x3352.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption">Agent Skill Discovery</figcaption></figure></div><p></p><h4>Layer 2: Activation</h4><p>When the platform determines a skill is relevant to the current task, it loads the full <code>SKILL.md</code> markdown body into context. I counted tokens across all 17 skills in Anthropic&#8217;s official repository: body size ranges from ~275 tokens (internal-comms) to ~8,000 tokens (skill-creator), with a median around 2,000.</p><p>The platform makes this decision using LLM reasoning over the descriptions from the discovery layer. 
Research shows that Claude selects skills through pure reasoning, with description quality directly determining routing accuracy.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Jv6k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6724bdbf-f331-495f-a20d-3124cf245ed5_5911x3352.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Jv6k!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6724bdbf-f331-495f-a20d-3124cf245ed5_5911x3352.png 424w, https://substackcdn.com/image/fetch/$s_!Jv6k!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6724bdbf-f331-495f-a20d-3124cf245ed5_5911x3352.png 848w, https://substackcdn.com/image/fetch/$s_!Jv6k!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6724bdbf-f331-495f-a20d-3124cf245ed5_5911x3352.png 1272w, https://substackcdn.com/image/fetch/$s_!Jv6k!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6724bdbf-f331-495f-a20d-3124cf245ed5_5911x3352.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Jv6k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6724bdbf-f331-495f-a20d-3124cf245ed5_5911x3352.png" width="1456" height="826" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6724bdbf-f331-495f-a20d-3124cf245ed5_5911x3352.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/03647c94-d99f-448b-bff5-c7b19ba67771_5911x3352.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:826,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1911688,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/190516304?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F03647c94-d99f-448b-bff5-c7b19ba67771_5911x3352.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Jv6k!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6724bdbf-f331-495f-a20d-3124cf245ed5_5911x3352.png 424w, https://substackcdn.com/image/fetch/$s_!Jv6k!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6724bdbf-f331-495f-a20d-3124cf245ed5_5911x3352.png 848w, https://substackcdn.com/image/fetch/$s_!Jv6k!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6724bdbf-f331-495f-a20d-3124cf245ed5_5911x3352.png 1272w, https://substackcdn.com/image/fetch/$s_!Jv6k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6724bdbf-f331-495f-a20d-3124cf245ed5_5911x3352.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Agent Skill Activation</figcaption></figure></div><p></p><h4>Layer 3: Execution</h4><p>During execution, the platform pulls in supporting materials on demand: scripts, reference documentation, templates, configuration files. 
These only enter context when the agent reaches a step that requires them.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!645g!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a339d53-5c62-4900-9575-4169fcb795c6_5911x3352.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!645g!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a339d53-5c62-4900-9575-4169fcb795c6_5911x3352.png 424w, https://substackcdn.com/image/fetch/$s_!645g!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a339d53-5c62-4900-9575-4169fcb795c6_5911x3352.png 848w, https://substackcdn.com/image/fetch/$s_!645g!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a339d53-5c62-4900-9575-4169fcb795c6_5911x3352.png 1272w, https://substackcdn.com/image/fetch/$s_!645g!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a339d53-5c62-4900-9575-4169fcb795c6_5911x3352.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!645g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a339d53-5c62-4900-9575-4169fcb795c6_5911x3352.png" width="1456" height="826" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1a339d53-5c62-4900-9575-4169fcb795c6_5911x3352.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:826,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1449250,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/190516304?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a339d53-5c62-4900-9575-4169fcb795c6_5911x3352.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!645g!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a339d53-5c62-4900-9575-4169fcb795c6_5911x3352.png 424w, https://substackcdn.com/image/fetch/$s_!645g!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a339d53-5c62-4900-9575-4169fcb795c6_5911x3352.png 848w, https://substackcdn.com/image/fetch/$s_!645g!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a339d53-5c62-4900-9575-4169fcb795c6_5911x3352.png 1272w, https://substackcdn.com/image/fetch/$s_!645g!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1a339d53-5c62-4900-9575-4169fcb795c6_5911x3352.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Agent Skill Execution</figcaption></figure></div><p>Here&#8217;s how this works in practice, using Anthropic&#8217;s PDF skill as an example.</p><ol><li><p>The <code>SKILL.md</code> body contains pointers to execution-layer files:</p></li></ol><pre><code><code>## Next Steps
- For advanced pypdfium2 usage, see REFERENCE.md
- For JavaScript libraries (pdf-lib), see REFERENCE.md
- If you need to fill out a PDF form, follow the instructions in FORMS.md
- For troubleshooting guides, see REFERENCE.md
</code></code></pre><p>When the agent hits a step that needs this context, it pulls in the referenced file. The content it finds falls into three categories:</p><ol start="2"><li><p><strong>Domain knowledge.</strong> <code>REFERENCE.md</code> contains technical context the agent needs to make informed decisions:</p></li></ol><pre><code><code>### Overview
pypdfium2 is a Python binding for PDFium (Chromium's PDF library).
It's excellent for fast PDF rendering, image generation,
and serves as a PyMuPDF replacement.
</code></code></pre><ol start="3"><li><p><strong>Executable scripts.</strong> The same file also includes code the agent can run directly:</p></li></ol><pre><code><code>import pypdfium2 as pdfium
from PIL import Image

pdf = pdfium.PdfDocument("document.pdf")
page = pdf[0]
bitmap = page.render(scale=2.0, rotation=0)

img = bitmap.to_pil()
img.save("page_1.png", "PNG")
</code></code></pre><ol start="4"><li><p><strong>Tool pointers.</strong> Some files reference standalone scripts as tools. From <code>FORMS.md</code>:</p></li></ol><pre><code><code>If you need to fill out a PDF form, first check to see if the PDF
has fillable form fields. Run this script from this file's directory:
`python scripts/check_fillable_fields &lt;file.pdf&gt;`, and depending on
the result go to either the "Fillable fields" or "Non-fillable fields"
and follow those instructions.</code></code></pre><p></p><p>Each layer adds context, and none of it loads until it's needed. The unloading side matters too. A naive implementation would discard a skill's context entirely after use, only to reload it minutes later when the next related task arrives. Smarter implementations cache recently used skills or keep frequently activated skills warm, balancing context efficiency with the cost of repeated loading.</p><p></p><h3>This Is Not Just About Coding Agents</h3><p>Most Agent Skills adoption today is in developer tools like Claude Code, Codex CLI, Cursor, and Gemini CLI. The pattern generalizes well beyond that.</p><p><strong>OpenClaw</strong> is the clearest example. It&#8217;s an open-source autonomous agent that passed 175K GitHub stars in under two weeks, and while it can code, its adoption is driven by non-coding use cases: managing calendars, drafting emails, controlling smart home devices, meal planning in Notion, coordinating across WhatsApp and Telegram. Its community registry, ClawHub, hosts over 13,000 skills, most of them non-technical.</p><p>The three-tier pattern works anywhere you need broad capability with focused execution. Customer support agents that know about 200 product features but only discuss 2 per conversation. Internal operations agents managing dozens of workflows. Research agents navigating large knowledge bases. Progressive disclosure is a <strong>system design pattern.</strong></p><p>This is where the responsibility falls on us as AI Engineers. Coding agent platforms like Claude Code and Cursor already implement the progressive disclosure interface. When we build non-coding agents, customer support bots, internal operations tools, domain-specific assistants, we need to build that same interface ourselves. The <code>SKILL.md</code> contract gives us the structure. 
Implementing the loading logic, the discovery-to-activation-to-execution pipeline, that&#8217;s our job.</p><p></p><h3>Agent Behavior, Accessible to Non-Technical People</h3><p>ChatGPT made conversing with AI accessible to everyone. Skills do the same for configuring how agents behave. A skill file is markdown with plain English instructions. A product manager can open one and understand what it does. A domain expert can edit one. A non-technical team lead can create one from scratch, or ask an AI to generate it using a skill-creator skill. Both Anthropic and Google ship built-in skill-creators that generate skill files from natural language descriptions.</p><p>On Claude.ai, non-developers can enable pre-built skills in settings, upload custom skill packages as ZIP files, and let Claude select relevant skills automatically. Configuring agent behavior used to require prompt engineering expertise or developer access. 
Now the people closest to the problem, domain experts, team leads, operators, can directly shape how agents behave.</p><p>Skill marketplaces like SkillsMP are already forming with a distribution model similar to browser extensions: discover, install, configure.</p><p></p><h3>The Ecosystem Moved Fast for a Reason</h3><p>Anthropic released the Agent Skills open standard on December 18, 2025.</p><p>Within weeks:</p><ul><li><p><strong>OpenAI</strong> adopted it for Codex CLI and ChatGPT.</p></li><li><p><strong>Google</strong> added skills to Gemini CLI.</p></li><li><p><strong>GitHub Copilot</strong> launched skills support the same day as the standard.</p></li><li><p><strong>Cursor</strong> integrated skills alongside their existing Rules system.</p></li></ul><p>This speed tells you something. Every one of these platforms faces the same two problems: how to give agents broad knowledge without destroying context quality, and how to let users configure agent behavior without requiring engineering expertise. The skills format solves both. Progressive disclosure keeps context lean, and the markdown-based contract makes skills accessible to anyone who can write plain English.</p><p></p><h3>Wrapping Up</h3><p>Agent Skills are the first mainstream implementation of progressive disclosure applied to agent context management. The pattern is simple: give agents a lightweight index of capabilities, pull in details when needed, and keep context lean.</p><p>Context efficiency is only half the value. Skills also make agent behavior configurable by anyone who can write plain English, the same way ChatGPT made conversing with AI accessible to everyone.</p><p>Coding agent platforms already implement the progressive disclosure interface. When we build non-coding agents, the responsibility falls on us as AI Engineers to implement the same three-tier loading pattern, with smart caching to avoid naive context churn.</p><p>The <code>SKILL.md</code> file defines the contract. 
The platforms implement the interface. Progressive disclosure as a system design principle is what ties them together. This pattern will outlast the specific file format and show up wherever agents need to choose from many options without drowning in information.</p>]]></content:encoded></item><item><title><![CDATA[Learning AI Engineering in 2025]]></title><description><![CDATA[My Reflections and Tips.]]></description><link>https://www.newsletter.swirlai.com/p/learning-ai-engineering-in-2025</link><guid isPermaLink="false">https://www.newsletter.swirlai.com/p/learning-ai-engineering-in-2025</guid><dc:creator><![CDATA[Aurimas Griciūnas]]></dc:creator><pubDate>Sat, 06 Sep 2025 08:35:48 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/18ccd2cf-e975-4671-879e-32b58dd349dd_1368x994.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#128075; <em>I am <a href="https://www.linkedin.com/in/aurimas-griciunas">Aurimas</a>. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in AI, MLOps, Data Engineering, Machine Learning and overall Data space.</em></p><div><hr></div><p>A few months ago I took my mission to teach AI to the next level - June 23rd marked the start of the first cohort of my End-to-End AI Engineering Bootcamp. Last week it was time to wrap up and reflect on the whole experience.</p><p>When I first launched the program, the goal was simple - bridge the gap between just knowing AI concepts and actually building, deploying, evaluating and engineering production-ready AI systems end-to-end.</p><p>Now, after weeks of working alongside ambitious learners from around the world, I can confidently say that the experience was a lot more than I expected:</p><ul><li><p>First of all, the goal I set was on point - hands-on learning in a cohort-based environment is how you succeed in learning AI Engineering.</p></li><li><p>The participants were active beyond my expectations - each live session became a club where discussions uncovered unknown unknowns, elevating the learning experience to new levels.</p></li><li><p>Full dedication from my side - I&#8217;ve spent countless nights working to meet the quality bar I set for myself.</p></li><li><p>Students were delivering week after week - capstone submissions kept flowing in, even when they sometimes required more time than initially planned.</p></li></ul><p></p><p>My personal numbers:</p><ul><li><p>40 hours of live lectures delivered.</p></li><li><p>30 hours of hands-on coding recorded and available for offline consumption.</p></li><li><p>250 pages written, available as offline reading 
material.</p></li></ul><p></p><p>Success story of the cohort:</p><ul><li><p>One of the capstone projects was sold even before the demo day!</p></li></ul><p></p><div><hr></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://swrlai.com/ai-bootcamp&quot;,&quot;text&quot;:&quot;Join Next Cohort&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://swrlai.com/ai-bootcamp"><span>Join Next Cohort</span></a></p><p>Use code <strong>KICKOFF10</strong> at the check-out for a 10% discount.</p><div><hr></div><p></p><p>Personally, I know I gave my all to the bootcamp. However, because it was the first cohort shaping the future of the program, I never expected this outcome:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NnzD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c46855b-ed08-45ce-af6b-542bc016d85e_1103x284.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NnzD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c46855b-ed08-45ce-af6b-542bc016d85e_1103x284.png 424w, https://substackcdn.com/image/fetch/$s_!NnzD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c46855b-ed08-45ce-af6b-542bc016d85e_1103x284.png 848w, https://substackcdn.com/image/fetch/$s_!NnzD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c46855b-ed08-45ce-af6b-542bc016d85e_1103x284.png 1272w, 
https://substackcdn.com/image/fetch/$s_!NnzD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c46855b-ed08-45ce-af6b-542bc016d85e_1103x284.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NnzD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c46855b-ed08-45ce-af6b-542bc016d85e_1103x284.png" width="1103" height="284" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2c46855b-ed08-45ce-af6b-542bc016d85e_1103x284.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:284,&quot;width&quot;:1103,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:67031,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/172870842?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c46855b-ed08-45ce-af6b-542bc016d85e_1103x284.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NnzD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c46855b-ed08-45ce-af6b-542bc016d85e_1103x284.png 424w, https://substackcdn.com/image/fetch/$s_!NnzD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c46855b-ed08-45ce-af6b-542bc016d85e_1103x284.png 848w, 
https://substackcdn.com/image/fetch/$s_!NnzD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c46855b-ed08-45ce-af6b-542bc016d85e_1103x284.png 1272w, https://substackcdn.com/image/fetch/$s_!NnzD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2c46855b-ed08-45ce-af6b-542bc016d85e_1103x284.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><div><hr></div><p>Here are some reviews that highlight student experience:</p><div class="captioned-image-container"><figure><a 
class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mCF-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c462b84-f90a-45d3-8ec6-f781a5de52b4_1202x818.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mCF-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c462b84-f90a-45d3-8ec6-f781a5de52b4_1202x818.png 424w, https://substackcdn.com/image/fetch/$s_!mCF-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c462b84-f90a-45d3-8ec6-f781a5de52b4_1202x818.png 848w, https://substackcdn.com/image/fetch/$s_!mCF-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c462b84-f90a-45d3-8ec6-f781a5de52b4_1202x818.png 1272w, https://substackcdn.com/image/fetch/$s_!mCF-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c462b84-f90a-45d3-8ec6-f781a5de52b4_1202x818.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mCF-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c462b84-f90a-45d3-8ec6-f781a5de52b4_1202x818.png" width="728" height="495.4276206322795" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0c462b84-f90a-45d3-8ec6-f781a5de52b4_1202x818.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:818,&quot;width&quot;:1202,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:197412,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/172870842?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c462b84-f90a-45d3-8ec6-f781a5de52b4_1202x818.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mCF-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c462b84-f90a-45d3-8ec6-f781a5de52b4_1202x818.png 424w, https://substackcdn.com/image/fetch/$s_!mCF-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c462b84-f90a-45d3-8ec6-f781a5de52b4_1202x818.png 848w, https://substackcdn.com/image/fetch/$s_!mCF-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c462b84-f90a-45d3-8ec6-f781a5de52b4_1202x818.png 1272w, https://substackcdn.com/image/fetch/$s_!mCF-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c462b84-f90a-45d3-8ec6-f781a5de52b4_1202x818.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iMif!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15f90a15-ca88-42d8-a326-1da0c2290954_1210x1380.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iMif!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15f90a15-ca88-42d8-a326-1da0c2290954_1210x1380.png 424w, 
https://substackcdn.com/image/fetch/$s_!iMif!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15f90a15-ca88-42d8-a326-1da0c2290954_1210x1380.png 848w, https://substackcdn.com/image/fetch/$s_!iMif!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15f90a15-ca88-42d8-a326-1da0c2290954_1210x1380.png 1272w, https://substackcdn.com/image/fetch/$s_!iMif!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15f90a15-ca88-42d8-a326-1da0c2290954_1210x1380.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iMif!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15f90a15-ca88-42d8-a326-1da0c2290954_1210x1380.png" width="1210" height="1380" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/15f90a15-ca88-42d8-a326-1da0c2290954_1210x1380.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1380,&quot;width&quot;:1210,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:377112,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/172870842?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15f90a15-ca88-42d8-a326-1da0c2290954_1210x1380.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!iMif!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15f90a15-ca88-42d8-a326-1da0c2290954_1210x1380.png 424w, https://substackcdn.com/image/fetch/$s_!iMif!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15f90a15-ca88-42d8-a326-1da0c2290954_1210x1380.png 848w, https://substackcdn.com/image/fetch/$s_!iMif!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15f90a15-ca88-42d8-a326-1da0c2290954_1210x1380.png 1272w, https://substackcdn.com/image/fetch/$s_!iMif!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15f90a15-ca88-42d8-a326-1da0c2290954_1210x1380.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QIJs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe405fcd7-d0ae-4455-b198-8e16dc4cc0bd_1212x860.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QIJs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe405fcd7-d0ae-4455-b198-8e16dc4cc0bd_1212x860.png 424w, https://substackcdn.com/image/fetch/$s_!QIJs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe405fcd7-d0ae-4455-b198-8e16dc4cc0bd_1212x860.png 848w, https://substackcdn.com/image/fetch/$s_!QIJs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe405fcd7-d0ae-4455-b198-8e16dc4cc0bd_1212x860.png 1272w, https://substackcdn.com/image/fetch/$s_!QIJs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe405fcd7-d0ae-4455-b198-8e16dc4cc0bd_1212x860.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QIJs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe405fcd7-d0ae-4455-b198-8e16dc4cc0bd_1212x860.png" width="1212" height="860" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e405fcd7-d0ae-4455-b198-8e16dc4cc0bd_1212x860.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:860,&quot;width&quot;:1212,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:230142,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/172870842?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe405fcd7-d0ae-4455-b198-8e16dc4cc0bd_1212x860.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QIJs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe405fcd7-d0ae-4455-b198-8e16dc4cc0bd_1212x860.png 424w, https://substackcdn.com/image/fetch/$s_!QIJs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe405fcd7-d0ae-4455-b198-8e16dc4cc0bd_1212x860.png 848w, https://substackcdn.com/image/fetch/$s_!QIJs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe405fcd7-d0ae-4455-b198-8e16dc4cc0bd_1212x860.png 1272w, https://substackcdn.com/image/fetch/$s_!QIJs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe405fcd7-d0ae-4455-b198-8e16dc4cc0bd_1212x860.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GNEx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850f350-615a-4e3f-808c-eda3a67ccc58_1202x988.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GNEx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850f350-615a-4e3f-808c-eda3a67ccc58_1202x988.png 424w, 
https://substackcdn.com/image/fetch/$s_!GNEx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850f350-615a-4e3f-808c-eda3a67ccc58_1202x988.png 848w, https://substackcdn.com/image/fetch/$s_!GNEx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850f350-615a-4e3f-808c-eda3a67ccc58_1202x988.png 1272w, https://substackcdn.com/image/fetch/$s_!GNEx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850f350-615a-4e3f-808c-eda3a67ccc58_1202x988.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GNEx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850f350-615a-4e3f-808c-eda3a67ccc58_1202x988.png" width="1202" height="988" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a850f350-615a-4e3f-808c-eda3a67ccc58_1202x988.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:988,&quot;width&quot;:1202,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:259623,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/172870842?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850f350-615a-4e3f-808c-eda3a67ccc58_1202x988.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!GNEx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850f350-615a-4e3f-808c-eda3a67ccc58_1202x988.png 424w, https://substackcdn.com/image/fetch/$s_!GNEx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850f350-615a-4e3f-808c-eda3a67ccc58_1202x988.png 848w, https://substackcdn.com/image/fetch/$s_!GNEx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850f350-615a-4e3f-808c-eda3a67ccc58_1202x988.png 1272w, https://substackcdn.com/image/fetch/$s_!GNEx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa850f350-615a-4e3f-808c-eda3a67ccc58_1202x988.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>Equally important to me personally is the following acknowledgement. It reassures me that what I am doing is on the right path.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Wu2o!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F764444b5-556d-44a1-bc28-1face2e419b7_575x212.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Wu2o!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F764444b5-556d-44a1-bc28-1face2e419b7_575x212.png 424w, https://substackcdn.com/image/fetch/$s_!Wu2o!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F764444b5-556d-44a1-bc28-1face2e419b7_575x212.png 848w, https://substackcdn.com/image/fetch/$s_!Wu2o!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F764444b5-556d-44a1-bc28-1face2e419b7_575x212.png 1272w, https://substackcdn.com/image/fetch/$s_!Wu2o!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F764444b5-556d-44a1-bc28-1face2e419b7_575x212.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Wu2o!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F764444b5-556d-44a1-bc28-1face2e419b7_575x212.png" width="602" height="221.95478260869567" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/764444b5-556d-44a1-bc28-1face2e419b7_575x212.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:212,&quot;width&quot;:575,&quot;resizeWidth&quot;:602,&quot;bytes&quot;:51968,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/172870842?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F764444b5-556d-44a1-bc28-1face2e419b7_575x212.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Wu2o!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F764444b5-556d-44a1-bc28-1face2e419b7_575x212.png 424w, https://substackcdn.com/image/fetch/$s_!Wu2o!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F764444b5-556d-44a1-bc28-1face2e419b7_575x212.png 848w, https://substackcdn.com/image/fetch/$s_!Wu2o!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F764444b5-556d-44a1-bc28-1face2e419b7_575x212.png 1272w, https://substackcdn.com/image/fetch/$s_!Wu2o!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F764444b5-556d-44a1-bc28-1face2e419b7_575x212.png 1456w" 
sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>This also gives me confidence that the course is among the best of what you can get on Maven when it comes to hands-on End-to-End AI Engineering.</p><p>The feedback I received throughout the bootcamp will be invaluable for evolving the program and making sure that the next cohort brings even more value to the students.</p><p></p><h3>What&#8217;s new in the upcoming cohort.</h3><p></p><h4>Syllabus.</h4><p></p><p>I have plenty of feedback and my own reflections that will help me bring the bootcamp to the next level. Some of the points that I will focus on even more in the upcoming cohort:</p><ul><li><p><strong>Even deeper focus on Evals:</strong> you will get the value of dedicated eval courses and more.</p></li><li><p><strong>Context Engineering hands-on deep dives</strong> for different levels of Agentic System complexity.</p></li><li><p><strong>An additional project</strong> that I code up alongside your learning journey.</p></li><li><p>&#8230;</p></li></ul><p></p><h4>New Partnerships.</h4><p></p><p>I have already closed a partnership with Modal for the next cohort:</p><ul><li><p>Each student will <strong>receive $500 in Modal credits</strong> for cloud compute and deployment.</p></li></ul><p>Additional partnerships may follow.</p><p></p><h4>Guest Lectures.</h4><p></p><p>The cohort will have a line-up of four or more guest lecturers who will bring unique insights into dedicated parts of the End-to-End AI Engineering process. 
</p><ul><li><p>This is in addition to all of the materials and sessions delivered by me.</p></li><li><p>List of speakers TBD.</p></li></ul><p></p><h3>Who is this course for.</h3><p></p><p>You will get the most out of the bootcamp if you are:</p><ul><li><p>A Data Scientist / ML Engineer looking to re-skill into AI Engineering.</p></li><li><p>A founder looking to build AI Native products.</p></li><li><p>An Engineering Leader looking to up-skill your teams.</p></li><li><p>A Software Engineer looking to learn and implement non-deterministic LLM-based systems.</p></li></ul><p></p><div><hr></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://swrlai.com/ai-bootcamp&quot;,&quot;text&quot;:&quot;Reserve Your Spot&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://swrlai.com/ai-bootcamp"><span>Reserve Your Spot</span></a></p><p>Use code <strong>KICKOFF10</strong> at checkout for a 10% discount.</p><div><hr></div><p></p><p>I also run dedicated cohorts and workshops, so if your company is looking to upskill its talent or tackle business challenges with AI, feel free to contact me at aurimas@swirlai.com.</p><p></p><h3>Price changes for future cohorts</h3><p></p><p>This will be the last time I offer the cohort at this price. Future cohorts will be priced higher due to several factors:</p><ul><li><p>The curriculum is already more advanced than what&#8217;s available elsewhere, and it will continue to expand and improve.</p></li><li><p>Demand for the program is strong.</p></li><li><p>Enrolling now provides lifetime access to materials from all future cohorts.</p></li></ul><p></p><h3>Closing Note</h3><p></p><p>I am incredibly grateful to everyone who joined the first cohort and helped shape this program. Cohort Beta is going to be even stronger - I&#8217;d love for you to be part of it. </p><p><em>P.S. I&#8217;ll be running a free video lesson series on my YouTube channel. 
This will be part of the onboarding for the bootcamp. Stay tuned!</em></p>]]></content:encoded></item><item><title><![CDATA[Breaking Down Context Engineering]]></title><description><![CDATA[All you need to know about the challenges surrounding the practice.]]></description><link>https://www.newsletter.swirlai.com/p/breaking-down-context-engineering</link><guid isPermaLink="false">https://www.newsletter.swirlai.com/p/breaking-down-context-engineering</guid><dc:creator><![CDATA[Aurimas Griciūnas]]></dc:creator><pubDate>Sat, 30 Aug 2025 07:01:29 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/0ff8da46-fa9f-4589-89ac-46d5ed749787_836x568.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#128075; <em>I am <a href="https://www.linkedin.com/in/aurimas-griciunas">Aurimas</a>. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. 
My mission is to help You UpSkill and keep You updated on the latest news in AI Engineering, Data Engineering, Machine Learning and overall Data space.</em></p><div><hr></div><p>Context Engineering has been dominating headlines in the AI Agent world in recent months. In this article I will outline my thoughts on the topic and what I have observed while building Agentic Systems for the past few years. We will also look closer at all the types of Context that Agentic Systems rely on and the challenges that come with managing them.</p><p>For Engineers who have been building AI Agents, Context Engineering is a new name but not a new practice - it is how we got Agents to perform the work we expected of them in the first place. 
When many were positioning Prompt Engineering as the hottest career path of the coming years, I was not surprised: to me, &#8220;old school&#8221; Prompt Engineering is a subset of Context Engineering.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YNNm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31c14cee-54eb-493e-8e49-ecbd9e085698_751x431.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YNNm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31c14cee-54eb-493e-8e49-ecbd9e085698_751x431.png 424w, https://substackcdn.com/image/fetch/$s_!YNNm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31c14cee-54eb-493e-8e49-ecbd9e085698_751x431.png 848w, https://substackcdn.com/image/fetch/$s_!YNNm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31c14cee-54eb-493e-8e49-ecbd9e085698_751x431.png 1272w, https://substackcdn.com/image/fetch/$s_!YNNm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31c14cee-54eb-493e-8e49-ecbd9e085698_751x431.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YNNm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31c14cee-54eb-493e-8e49-ecbd9e085698_751x431.png" width="554" height="317.94141145139815" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/31c14cee-54eb-493e-8e49-ecbd9e085698_751x431.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1ad0da75-c0dd-4e4e-a0a7-26d70497d59b_751x431.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:431,&quot;width&quot;:751,&quot;resizeWidth&quot;:554,&quot;bytes&quot;:20192,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/171633835?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1ad0da75-c0dd-4e4e-a0a7-26d70497d59b_751x431.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YNNm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31c14cee-54eb-493e-8e49-ecbd9e085698_751x431.png 424w, https://substackcdn.com/image/fetch/$s_!YNNm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31c14cee-54eb-493e-8e49-ecbd9e085698_751x431.png 848w, https://substackcdn.com/image/fetch/$s_!YNNm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31c14cee-54eb-493e-8e49-ecbd9e085698_751x431.png 1272w, https://substackcdn.com/image/fetch/$s_!YNNm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F31c14cee-54eb-493e-8e49-ecbd9e085698_751x431.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Prompt Engineering - a subset of Context Engineering</figcaption></figure></div><p></p><h3>What is Context Engineering and why it is important.</h3><p></p><p>Let&#8217;s quickly remember what an Agentic System is.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wob6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F349db884-e150-438e-8d6a-14207d841adf_1805x643.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!wob6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F349db884-e150-438e-8d6a-14207d841adf_1805x643.png 424w, https://substackcdn.com/image/fetch/$s_!wob6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F349db884-e150-438e-8d6a-14207d841adf_1805x643.png 848w, https://substackcdn.com/image/fetch/$s_!wob6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F349db884-e150-438e-8d6a-14207d841adf_1805x643.png 1272w, https://substackcdn.com/image/fetch/$s_!wob6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F349db884-e150-438e-8d6a-14207d841adf_1805x643.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wob6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F349db884-e150-438e-8d6a-14207d841adf_1805x643.png" width="1456" height="519" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/349db884-e150-438e-8d6a-14207d841adf_1805x643.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9ea7169d-6b8c-4b8c-851e-e735c34d33fa_1805x643.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:519,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:131194,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/171633835?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9ea7169d-6b8c-4b8c-851e-e735c34d33fa_1805x643.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wob6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F349db884-e150-438e-8d6a-14207d841adf_1805x643.png 424w, https://substackcdn.com/image/fetch/$s_!wob6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F349db884-e150-438e-8d6a-14207d841adf_1805x643.png 848w, https://substackcdn.com/image/fetch/$s_!wob6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F349db884-e150-438e-8d6a-14207d841adf_1805x643.png 1272w, https://substackcdn.com/image/fetch/$s_!wob6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F349db884-e150-438e-8d6a-14207d841adf_1805x643.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">A simplified Agentic System Topology.</figcaption></figure></div><p>In simple terms, it is a topology of LLM calls connected via different patterns. It could be an iterative loop or a more sequential pattern. The main takeaway is that the output of each LLM node influences the downstream system.</p><p>The quality of an agent is only as good as the context you pass to it via the prompts fed to the LLM at each step. One could think that the ever-expanding context window limits of LLMs will be the solution, but there are many inherent problems with pushing as much data into prompts as possible. 
Several ways in which the model can be confused by too much information have been documented:</p><ul><li><p><strong>Context Poisoning:</strong> When a hallucination makes it into the context</p></li><li><p><strong>Context Distraction:</strong> When the context overwhelms the training</p></li><li><p><strong>Context Confusion:</strong> When superfluous context influences the response</p></li><li><p><strong>Context Clash:</strong> When parts of the context disagree</p></li></ul><p></p><p>This is where Context Engineering comes in. It is the practice of providing the minimal amount of focused context to each step of the Agentic System so that the step can perform the work it was designed to do.</p><p>There are multiple types of context that exist and need to be managed and orchestrated in modern Agentic Systems. Below is an image listing these different types.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!I3_k!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4494525c-bf33-4cd8-8f26-c09d7c91e1d8_751x431.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!I3_k!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4494525c-bf33-4cd8-8f26-c09d7c91e1d8_751x431.png 424w, https://substackcdn.com/image/fetch/$s_!I3_k!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4494525c-bf33-4cd8-8f26-c09d7c91e1d8_751x431.png 848w, https://substackcdn.com/image/fetch/$s_!I3_k!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4494525c-bf33-4cd8-8f26-c09d7c91e1d8_751x431.png 1272w, 
https://substackcdn.com/image/fetch/$s_!I3_k!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4494525c-bf33-4cd8-8f26-c09d7c91e1d8_751x431.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!I3_k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4494525c-bf33-4cd8-8f26-c09d7c91e1d8_751x431.png" width="630" height="361.55792276964047" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4494525c-bf33-4cd8-8f26-c09d7c91e1d8_751x431.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e6bab1cb-10af-4997-8ce1-abad10a97f16_751x431.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:431,&quot;width&quot;:751,&quot;resizeWidth&quot;:630,&quot;bytes&quot;:36604,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/171633835?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe6bab1cb-10af-4997-8ce1-abad10a97f16_751x431.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!I3_k!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4494525c-bf33-4cd8-8f26-c09d7c91e1d8_751x431.png 424w, https://substackcdn.com/image/fetch/$s_!I3_k!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4494525c-bf33-4cd8-8f26-c09d7c91e1d8_751x431.png 848w, 
https://substackcdn.com/image/fetch/$s_!I3_k!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4494525c-bf33-4cd8-8f26-c09d7c91e1d8_751x431.png 1272w, https://substackcdn.com/image/fetch/$s_!I3_k!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4494525c-bf33-4cd8-8f26-c09d7c91e1d8_751x431.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Different types of Context</figcaption></figure></div><p>Let&#8217;s look into each part of the context separately and 
analyse the challenges we need to solve when engineering and orchestrating it.</p><p></p><div><hr></div><p>Join the September Cohort of my End-to-End AI Engineering Bootcamp to learn how to solve the challenges of Context Engineering for production systems in the real world. (Use code <strong>KICKOFF10 </strong>at the checkout for a 10% discount).</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://swrlai.com/ai-bootcamp&quot;,&quot;text&quot;:&quot;AI Engineering Bootcamp&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://swrlai.com/ai-bootcamp"><span>AI Engineering Bootcamp</span></a></p><div><hr></div><p></p><h4>Instructions / System Prompt.</h4><p></p><p><strong>Where it fits:</strong> System instructions (or the system prompt) are rails that shape the LLM&#8217;s outputs at a general level. This is a special prompt that defines the role, behavior, and boundaries of the LLM before any user input is passed to it. These instructions set the overall tone, personality, scope and response style for the model. They act as policy, ensuring the LLM follows certain rules regardless of what the user asks.</p><p></p><p><strong>Challenges.</strong></p><ul><li><p>Alignment - the system prompt must cover all the necessary guidance (from factuality checks to style guidelines) within a limited space. With too little guidance the model may go off-track; too much might constrain useful behavior.</p></li><li><p>Risk of prompt injection - malicious user input can override or contradict the system instructions. Current LLMs don&#8217;t perfectly separate system instructions from user text. Attackers may embed conflicting instructions that cause the model to ignore the original rules.</p></li><li><p>System context can conflict with user needs - if the user asks for something outside the allowed scope, the LLM must refuse per system instructions. 
Maintaining helpfulness within these constraints is hard.</p></li></ul><p></p><h4>User Prompt.</h4><p></p><p><strong>Where it fits:</strong> The user prompt is the direct request or query provided by the user, the immediate task the user wants done. This part of the context was the main focus of prompt engineering - developers tweaked wording and details to try to get better answers from LLMs.</p><p></p><p><strong>Challenges.</strong></p><ul><li><p>Real use cases often involve multi-turn conversations/interactions, follow-up questions or the need for external information.</p></li><li><p>A single user query might require multiple steps to be answered, e.g. searching for information, then summarizing, then formatting an answer. Such a task can&#8217;t be solved in a single Agent turn; it requires multiple turns in sequence. Check out the Deep Research Agent as an example:</p></li></ul><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;4cd19d42-0c1a-4e51-92c0-174e29a5745a&quot;,&quot;caption&quot;:&quot;&#128075; I am Aurimas. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. 
My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Building Deep Research Agent from scratch&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;Aurimas Griciunas is an AI engineering expert, LinkedIn Top Voice, and the creator of the popular SwirlAI newsletter, trusted by thousands of data and AI professionals.&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-03-11T08:03:16.246Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d70c1604-8ab3-4476-8ef6-f8f8f5a4eadd_3040x2289.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/building-deep-research-agent-from&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:157875435,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:101,&quot;comment_count&quot;:6,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;SwirlAI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!JA65!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1382ba7c-34db-4b33-bbb2-8e23af4b6f7a_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><ul><li><p>Context engineering 
addresses this by breaking complex user requests into smaller sub-tasks or chaining prompts. The challenge here is understanding the user&#8217;s true intent and managing the agent&#8217;s flow to address that intent.</p></li><li><p>Ensuring that the user&#8217;s request is fully understood and fulfilled over multiple turns requires a lot of iteration and comprehensive eval suites covering multiple complex interaction examples.</p></li></ul><p>In short, the user prompt is just the starting point. The context engineering needed to resolve the user&#8217;s intent is what comes next.</p><p></p><h4>Retrieved Context.</h4><p></p><p><strong>Where it fits:</strong> Retrieved context refers to additional information pulled in from outside sources to help answer the user&#8217;s query. This is the main concept on which Retrieval Augmented Generation (RAG) systems are based. When a question requires facts or data not already in the LLM&#8217;s parametric knowledge, the system fetches relevant context from databases, documents, the web or other APIs. Once we retrieve the data, we inject it into the prompt that is given to the model. This external information becomes part of the model&#8217;s context for that turn.</p><p></p><p><strong>Challenges.</strong></p><ul><li><p>The primary challenge with retrieved context is finding and choosing the right information. In a large enterprise knowledge base or internet-scale data, there may be thousands of snippets that could be related.</p></li><li><p>The system has a limited context window (a maximum number of tokens it can attend to, and it is also subject to the previously mentioned confusion modes) so we must be selective about what we include there. We need to solve an optimisation problem - from N candidate pieces of data, pick the ones that matter most for this query. 
Retrieving too little will leave out key facts; retrieving too much will produce noise.</p></li><li><p>Mitigating these issues requires hybrid retrieval and clever ranking of retrieved snippets, filtering out low-quality or conflicting data, and sometimes summarising or compressing the retrieved text.</p></li><li><p>Properly preparing the corpus for retrieval is also a huge Data Engineering challenge, as you need to carefully preprocess each chunk before embedding so that it does not lose relevant surrounding context or more global metadata.</p></li></ul><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p></p><h4>State (short-term memory).</h4><p></p><p><strong>Where it fits:</strong> There is a blurred line between State and conversation history - the recent messages in the current session. The System State serves as the model&#8217;s working memory, holding the latest state available. The conversation history is part of this state. That does not mean, though, that you always pass the entire conversation history or other elements of the state each time you invoke an LLM as part of the system execution. 
Context Engineering is largely about how you manage this evolving state and which minimal parts of it are passed to LLMs so that they perform exactly the tasks they are meant to perform.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wuZh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe14425b9-4b56-4245-ba21-a2a75bd4701a_3709x2095.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wuZh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe14425b9-4b56-4245-ba21-a2a75bd4701a_3709x2095.png 424w, https://substackcdn.com/image/fetch/$s_!wuZh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe14425b9-4b56-4245-ba21-a2a75bd4701a_3709x2095.png 848w, https://substackcdn.com/image/fetch/$s_!wuZh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe14425b9-4b56-4245-ba21-a2a75bd4701a_3709x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!wuZh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe14425b9-4b56-4245-ba21-a2a75bd4701a_3709x2095.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wuZh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe14425b9-4b56-4245-ba21-a2a75bd4701a_3709x2095.png" width="1456" height="822" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e14425b9-4b56-4245-ba21-a2a75bd4701a_3709x2095.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/15e5687c-140f-4a5a-ad12-11ef6c1a745a_3709x2095.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:822,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:329960,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/171633835?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F15e5687c-140f-4a5a-ad12-11ef6c1a745a_3709x2095.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wuZh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe14425b9-4b56-4245-ba21-a2a75bd4701a_3709x2095.png 424w, https://substackcdn.com/image/fetch/$s_!wuZh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe14425b9-4b56-4245-ba21-a2a75bd4701a_3709x2095.png 848w, https://substackcdn.com/image/fetch/$s_!wuZh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe14425b9-4b56-4245-ba21-a2a75bd4701a_3709x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!wuZh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe14425b9-4b56-4245-ba21-a2a75bd4701a_3709x2095.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Expanding Conversation History.</figcaption></figure></div><p><strong>Challenges.</strong></p><ul><li><p>The obvious limitation is that the context window is finite. In long conversations or long-running AI agent tasks, the accumulated conversation can exceed the size of the context window, forcing the system to drop or compress older messages. The challenge is deciding what to remember and what to forget. If you append the entire conversation every time, you will hit token limits. 
If you forget too aggressively, the model will lose important details and repeat past questions and tool calls.</p></li><li><p>Context engineering at this layer involves techniques like summarisation, clipping and context caching. For example, summarising earlier parts of the conversation and keeping only the summary, or selectively carrying forward key facts.</p></li><li><p>Another issue is context drift - over many turns, if the model&#8217;s memory of early instructions or facts gets too diluted, it may start to deviate from the original intent. We need to continuously ensure that our Agents stay on track (sometimes by re-injecting key instructions into the context if they were dropped).</p></li></ul><p></p><h4>Long-Term Memory.</h4><p></p><p><strong>Where it fits:</strong> Long-term memory is a mechanism for Agents to retain information beyond the current session. This can include data like the user&#8217;s preferences and profile, facts learned in past conversations, or any data that should persist over time. Implementations of long-term memory will vary - from simple databases of past Q&amp;A pairs to vector embeddings of conversation transcripts that can be searched on demand. The key is that this memory is persistent and can be pulled into context when relevant.</p><p></p><p><strong>Challenges.</strong></p><ul><li><p>Relevance and retrieval - given months of accumulated interactions or facts, how do we fetch only the pieces that matter for the current conversation? As with retrieved context, it is not viable to load everything - the system needs to search and inject the most relevant memories. You might have hundreds of things the Agent could recall about a user, but only a couple are useful in answering the current question.</p></li><li><p>Consistency and updating - long-term memory can become a liability if not managed well - e.g. the user&#8217;s situation may change (they get a new job) and the memory store must be updated or old info deprecated. 
Stale or incorrect memories can lead to responses that could potentially breach trust.</p></li><li><p>Privacy and security - storing personal data long-term requires safeguards so that sensitive info is not unintentionally exposed.</p></li><li><p>Integrating memory into context without confusion - if our system pulls in a summary of your past chats, that summary becomes part of the prompt which could potentially conflict with other instructions or overwhelm the model if too verbose.</p></li></ul><p></p><p>The connection between Long and Short-Term memory can be visualised as follows:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xiZx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3246b402-4637-4782-a935-cad801c4d1c5_2481x1842.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xiZx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3246b402-4637-4782-a935-cad801c4d1c5_2481x1842.png 424w, https://substackcdn.com/image/fetch/$s_!xiZx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3246b402-4637-4782-a935-cad801c4d1c5_2481x1842.png 848w, https://substackcdn.com/image/fetch/$s_!xiZx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3246b402-4637-4782-a935-cad801c4d1c5_2481x1842.png 1272w, https://substackcdn.com/image/fetch/$s_!xiZx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3246b402-4637-4782-a935-cad801c4d1c5_2481x1842.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!xiZx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3246b402-4637-4782-a935-cad801c4d1c5_2481x1842.png" width="1456" height="1081" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3246b402-4637-4782-a935-cad801c4d1c5_2481x1842.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1e63b44a-4f89-4a4f-b343-35d48b71e0c2_2481x1842.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1081,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:477718,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/171633835?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1e63b44a-4f89-4a4f-b343-35d48b71e0c2_2481x1842.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xiZx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3246b402-4637-4782-a935-cad801c4d1c5_2481x1842.png 424w, https://substackcdn.com/image/fetch/$s_!xiZx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3246b402-4637-4782-a935-cad801c4d1c5_2481x1842.png 848w, https://substackcdn.com/image/fetch/$s_!xiZx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3246b402-4637-4782-a935-cad801c4d1c5_2481x1842.png 1272w, 
https://substackcdn.com/image/fetch/$s_!xiZx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3246b402-4637-4782-a935-cad801c4d1c5_2481x1842.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Short and Long-Term memory.</figcaption></figure></div><ol><li><p><strong>Episodic</strong> - This type of memory contains past interactions and actions performed by the agent. After an action is taken, the application controlling the agent would store the action in some kind of persistent storage so that it can be retrieved later if needed. 
A good example would be using a vector database to store the semantic meaning of the interactions.</p></li><li><p><strong>Semantic</strong> - Any external information that is available to the agent and any knowledge the agent should have about itself. You can think of this as a context similar to the one used in RAG applications. It can be internal knowledge only available to the agent or a grounding context that isolates part of the internet-scale data for more accurate answers.</p></li><li><p><strong>Procedural</strong> - This is systemic information like the structure of the System Prompt, available tools, guardrails etc. It will usually be stored in Git, Prompt and Tool Registries.</p></li><li><p>Occasionally, the agent application pulls information from long-term memory and stores it locally if it is needed for the task at hand.</p></li><li><p>All of the information pulled from long-term memory or stored in local memory is called short-term or working memory. Compiling all of it produces the prompt that is passed to the LLM, which then provides further actions to be taken by the system.</p></li></ol><p>We usually label 1. - 3. as Long-Term memory and 5. as Short-Term memory.</p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p></p><h4>Tools.</h4><p></p><p><strong>Where it fits:</strong> Most advanced AI systems use external tools - for example, calling APIs, running code, searching the web or simply performing Retrieval via a predefined function. This extends the model&#8217;s capabilities by letting it act in the environment or fetch information dynamically. 
In terms of context, whenever we expose a tool to the LLM, we need to take care of two aspects:</p><ol><li><p><strong>Tool definitions</strong> - the instructions or documentation describing what tools exist and how to use them (often provided as part of the system prompt).</p></li><li><p><strong>Tool outputs</strong> - the results returned by the tool, which the model should incorporate into its next response.</p></li></ol><p>Vanilla Tool use implementation can be described as follows:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ouvp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7563b31-a08e-43ba-8e1f-163660778ca0_2106x1194.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ouvp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7563b31-a08e-43ba-8e1f-163660778ca0_2106x1194.png 424w, https://substackcdn.com/image/fetch/$s_!ouvp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7563b31-a08e-43ba-8e1f-163660778ca0_2106x1194.png 848w, https://substackcdn.com/image/fetch/$s_!ouvp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7563b31-a08e-43ba-8e1f-163660778ca0_2106x1194.png 1272w, https://substackcdn.com/image/fetch/$s_!ouvp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7563b31-a08e-43ba-8e1f-163660778ca0_2106x1194.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!ouvp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7563b31-a08e-43ba-8e1f-163660778ca0_2106x1194.png" width="1456" height="825" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a7563b31-a08e-43ba-8e1f-163660778ca0_2106x1194.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3aa1c299-2d5e-4932-8576-70d159846b05_2106x1194.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:825,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:165589,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/171633835?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3aa1c299-2d5e-4932-8576-70d159846b05_2106x1194.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ouvp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7563b31-a08e-43ba-8e1f-163660778ca0_2106x1194.png 424w, https://substackcdn.com/image/fetch/$s_!ouvp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7563b31-a08e-43ba-8e1f-163660778ca0_2106x1194.png 848w, https://substackcdn.com/image/fetch/$s_!ouvp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7563b31-a08e-43ba-8e1f-163660778ca0_2106x1194.png 1272w, 
https://substackcdn.com/image/fetch/$s_!ouvp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa7563b31-a08e-43ba-8e1f-163660778ca0_2106x1194.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Tools via Native Function Calling.</figcaption></figure></div><ol><li><p>User Query is passed to the Agent (usually a Python application).</p></li><li><p>All of the available Functions/Tools are defined as part of the Agent code (procedural memory).</p></li><li><p>The list of available Tools is passed together with the User Query to an LLM via a 
prompt. The LLM figures out which functions need to be invoked and with what parameters.</p></li><li><p>The Agent application directly executes the functions.</p></li><li><p>User Query is sent to the LLM together with the data retrieved after function execution.</p></li><li><p>The answer is constructed and returned to the user via the Agent.</p></li></ol><p></p><p><strong>Challenges.</strong></p><ul><li><p>The model must understand when and how to suggest a tool invocation. This often requires carefully prompting the model with instructions like:</p><pre><code>When making tool calls, use this exact format:
{
    "name": "tool_name",
    "arguments": {
        "parameter1": "value1",
        "parameter2": "value2"
    }
}

CRITICAL: All parameters must go inside the "arguments" object, not at the top level of the tool call.

Examples:
- Remove item from shopping cart:
{
    "name": "remove_from_shopping_cart",
    "arguments": {
        "product_id": "123",
        "user_id": "123",
        "cart_id": "456"
    }
}</code></pre></li><li><p>Once a tool is used, the output needs to be integrated seamlessly. If the tool returns a lot of data, including all of it in the context can quickly hit context size limits. Often we need to post-process tool outputs, e.g. extracting parts that are relevant, summarising etc. before appending them to the context.</p></li><li><p>State maintenance between tool calls and the model&#8217;s reasoning. Sometimes Agent architectures implement patterns similar to ReAct where the model iteratively reasons, calls a tool, analyses the result and continues. Ensuring that the model&#8217;s chain of reasoning is properly preserved across these steps is not trivial, especially when you also need to keep track of the original user query.</p></li><li><p>Error handling - tools can fail or return unexpected results. In those cases we need to stitch safety nets around the non-deterministic part of the system (i.e. guide the model on how it should respond in the case of such errors, e.g. apologise and exit the system).</p></li><li><p>From a security standpoint, tool use carries risks in context. E.g. if the System reads from a URL, the page might contain malicious text (prompt injection via tool output) or sensitive data that shouldn&#8217;t be revealed. Context engineering for tools also means guard-railing such cases and properly mitigating the risks.</p></li></ul><p></p><h4>Structured Output.</h4><p></p><p><strong>Where it fits:</strong> In most agentic applications, it is not enough for the LLM to respond with free-form text, since the output of one LLM call can be an input to the next LLM call. We therefore often need the answer in a particular format or structure. This could be a JSON object, an HTML snippet or a SQL query. Context for ensuring correct structured outputs refers to any instructions or mechanisms that enforce a certain format for the model&#8217;s response. 
In most cases you would define these rules as part of the System Prompt. On top of the instructions, you should use frameworks like Instructor that help make sure structured outputs are properly handled.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ctrM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F883f06b6-9b7c-4e2c-9fa8-4ac84edbde7e_2054x1453.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ctrM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F883f06b6-9b7c-4e2c-9fa8-4ac84edbde7e_2054x1453.png 424w, https://substackcdn.com/image/fetch/$s_!ctrM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F883f06b6-9b7c-4e2c-9fa8-4ac84edbde7e_2054x1453.png 848w, https://substackcdn.com/image/fetch/$s_!ctrM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F883f06b6-9b7c-4e2c-9fa8-4ac84edbde7e_2054x1453.png 1272w, https://substackcdn.com/image/fetch/$s_!ctrM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F883f06b6-9b7c-4e2c-9fa8-4ac84edbde7e_2054x1453.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ctrM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F883f06b6-9b7c-4e2c-9fa8-4ac84edbde7e_2054x1453.png" width="1456" height="1030" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/883f06b6-9b7c-4e2c-9fa8-4ac84edbde7e_2054x1453.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4f8e0faf-683b-4a9b-bed6-12563fbe2d68_2054x1453.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1030,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:211872,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/171633835?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4f8e0faf-683b-4a9b-bed6-12563fbe2d68_2054x1453.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ctrM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F883f06b6-9b7c-4e2c-9fa8-4ac84edbde7e_2054x1453.png 424w, https://substackcdn.com/image/fetch/$s_!ctrM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F883f06b6-9b7c-4e2c-9fa8-4ac84edbde7e_2054x1453.png 848w, https://substackcdn.com/image/fetch/$s_!ctrM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F883f06b6-9b7c-4e2c-9fa8-4ac84edbde7e_2054x1453.png 1272w, https://substackcdn.com/image/fetch/$s_!ctrM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F883f06b6-9b7c-4e2c-9fa8-4ac84edbde7e_2054x1453.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Structured Outputs ensure Validated Inputs for downstream tasks</figcaption></figure></div><p><strong>Challenges.</strong></p><ul><li><p>Anyone who has tried to get an LLM to output perfectly structured data knows how frustrating it can be. Models often hallucinate formatting or include extra explanations despite instructions.</p></li><li><p>A major challenge here is reliability - how to guarantee the output is formatted exactly as needed (no missing brackets, no additional text). 
While there are ways to minimise inconsistencies (like using OpenAI&#8217;s function calling and similar APIs that allow developers to specify a JSON schema or grammar that the model must adhere to, constraining the model&#8217;s decoding), not all systems support that.</p></li><li><p>The good news is that tooling is improving - validators, format enforcement and better prompt techniques. A good example is the Instructor library that I personally use in most of my projects. It forces the outputs to conform to specific Pydantic schema objects and applies advanced retry techniques if the initial parsing of the outputs is not successful.</p></li></ul><p></p><h3>Wrapping up.</h3><p></p><p>Context engineering is an actively evolving practice. We are all learning in the trenches - even though some practices might work today, some of them might become less relevant in the future as the tooling improves, and some might stop working due to changes in the behaviour of LLMs. Whatever you think about it, it is time to participate in defining the best practices and help move the industry forward.</p><p></p><p>Happy building!</p><p></p><p>Hope you enjoyed this article and hope to see you next week!</p><div><hr></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/p/sponsorships&quot;,&quot;text&quot;:&quot;Partner with SwirlAI&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.newsletter.swirlai.com/p/sponsorships"><span>Partner with 
SwirlAI</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://swrlai.com/ai-bootcamp&quot;,&quot;text&quot;:&quot;AI Engineering Bootcamp&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://swrlai.com/ai-bootcamp"><span>AI Engineering Bootcamp</span></a></p><div><hr></div>]]></content:encoded></item><item><title><![CDATA[Enterprise Agentic AI Hierarchy of Needs]]></title><description><![CDATA[The crucial layers of infrastructure that make up a production grade Agentic AI system.]]></description><link>https://www.newsletter.swirlai.com/p/enterprise-agentic-ai-hierarchy-of</link><guid isPermaLink="false">https://www.newsletter.swirlai.com/p/enterprise-agentic-ai-hierarchy-of</guid><dc:creator><![CDATA[Aurimas Griciūnas]]></dc:creator><pubDate>Wed, 18 Jun 2025 13:35:05 GMT</pubDate><enclosure 
url="https://substack-post-media.s3.amazonaws.com/public/images/6bfb6908-6268-4528-9e79-7f3f2cbc15c6_3757x3153.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div><hr></div><p>&#128075; <em>I am <a href="https://www.linkedin.com/in/aurimas-griciunas">Aurimas</a>. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in AI Engineering, Data Engineering, Machine Learning and overall Data space.</em></p><div><hr></div><p>In today&#8217;s episode I want to share with you one of the frameworks that I have designed. It helps me think about adding complexity to the infrastructure needed to power Agentic Systems that we are building for enterprises. 
I hope it will help you as well.</p><p></p><p><strong>In the post you will find:</strong></p><ul><li><p>The Cowboy Agentic AI Hierarchy of Needs.</p></li><li><p>The Enterprise Agentic AI Hierarchy of Needs.</p></li><li><p>Different layers of infrastructure needed to power production grade Agentic Systems.</p></li><li><p>Problems that these layers are meant to solve and some notable vendors that are tackling them.</p></li></ul><p></p><p>The framework has evolved over time. I will start with the initial version.</p><p></p><h3>The Cowboy Agentic AI Hierarchy of Needs.</h3><p>I started contemplating the Framework I am about to introduce to you around the time ChatGPT was launched a few years ago. It was inspired by the conversations I had with companies that were building agentic applications back then, on the first available LLM APIs.</p><p>The idea behind the Agentic AI Hierarchy of Needs framework is simple but useful:</p><ul><li><p>While working on Agentic Systems you want to build and ship them fast.</p></li><li><p>What are the minimal requirements for Agentic AI specific tooling and infrastructure you can get away with as you progress in the lifecycle of your product:</p><ul><li><p>POC &#8594; MVP &#8594; Beta &#8594; GA &#8594; &#8230;</p></li></ul></li></ul><p>Here is the image representing the importance of each infrastructure layer I observed back then (it is based on how companies would approach adding additional tooling as they developed the product).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!R0dz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2675724-49e1-47f4-bcd0-0a034ad956f8_2765x1714.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!R0dz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2675724-49e1-47f4-bcd0-0a034ad956f8_2765x1714.png 424w, https://substackcdn.com/image/fetch/$s_!R0dz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2675724-49e1-47f4-bcd0-0a034ad956f8_2765x1714.png 848w, https://substackcdn.com/image/fetch/$s_!R0dz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2675724-49e1-47f4-bcd0-0a034ad956f8_2765x1714.png 1272w, https://substackcdn.com/image/fetch/$s_!R0dz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2675724-49e1-47f4-bcd0-0a034ad956f8_2765x1714.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!R0dz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2675724-49e1-47f4-bcd0-0a034ad956f8_2765x1714.png" width="1456" height="903" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a2675724-49e1-47f4-bcd0-0a034ad956f8_2765x1714.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/62fcb9a9-3f32-4b83-b111-eeb6404b7087_2765x1714.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:903,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:303244,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/165351856?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F62fcb9a9-3f32-4b83-b111-eeb6404b7087_2765x1714.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!R0dz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2675724-49e1-47f4-bcd0-0a034ad956f8_2765x1714.png 424w, https://substackcdn.com/image/fetch/$s_!R0dz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2675724-49e1-47f4-bcd0-0a034ad956f8_2765x1714.png 848w, https://substackcdn.com/image/fetch/$s_!R0dz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2675724-49e1-47f4-bcd0-0a034ad956f8_2765x1714.png 1272w, https://substackcdn.com/image/fetch/$s_!R0dz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa2675724-49e1-47f4-bcd0-0a034ad956f8_2765x1714.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Cowboy Agentic AI Hierarchy of Needs.</figcaption></figure></div><p>Here is a short description of the diagram. We will go into some layers of the infrastructure in more detail as we progress through the blog post.</p><h4>Model Layer:</h4><ul><li><p>CPU/GPU hardware: In most cases we take this for granted, as it is handled for us builders.</p></li><li><p>Base Infrastructure: The next layer of infrastructure needed to handle LLM deployment and serving.</p></li><li><p>Foundation Models themselves: In most cases provided by the big AI Labs.</p></li></ul><p>We can start prototyping, build the simplest applications and reach POC level with only this layer in place.</p><p>You can read more about the 
best practices of moving from POC to MVP etc. here:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;6a0f762a-49b6-44ed-a9ca-4ac549d095f4&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Evaluation Driven Development for Agentic Systems.&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;Aurimas Griciunas is an AI engineering expert, LinkedIn Top Voice, and the creator of the popular SwirlAI newsletter, trusted by thousands of data and AI professionals.&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-05-22T13:16:55.745Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4aa1ed38-f60a-4673-a181-c2e31c38551b_4218x3628.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/evaluation-driven-development-for&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:164005201,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:50,&quot;comment_count&quot;:2,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;SwirlAI 
Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1382ba7c-34db-4b33-bbb2-8e23af4b6f7a_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p></p><h4>Application Layer:</h4><p>This is where the most business value is created - continuous development of applications powered by GenAI.</p><p>To reach a reliable <em>MVP</em> we usually needed: </p><ul><li><p>Data Storage.</p></li><li><p>LLM Orchestration.</p></li></ul><p><em>Beta:</em></p><ul><li><p>Model Routing.</p></li><li><p>LLM Observability.</p></li></ul><p><em>GA:</em></p><ul><li><p>LLM Evaluation.</p></li><li><p>LLM Security.</p></li><li><p>AI Agent Memory.</p></li><li><p>AI Agent Communication Protocols.</p></li></ul><p>I am now calling this the <strong>Cowboy Agentic AI Hierarchy of Needs</strong>. This is because back then almost none of the startups I worked with cared about the quality of the application nearly as much as they cared that it worked at scale and never went down.</p><p>Of course, everyone wanted VC money flowing in. The applications would crumble due to quality issues, but new customer inflow would still be higher than churn, so that was ok.</p><p></p><h3>The Enterprise Agentic AI Hierarchy of Needs.</h3><p>I still consult startups, but a lot of focus has shifted towards enterprise grade agentic applications in the last year plus. 
In general, the Cowboy Agentic AI Hierarchy of Needs did not stand the test of time or meet the quality bar needed for applications being shipped to production.</p><p>The shift was expected - LLM Observability and Evaluation had to take centre stage, pushing even Orchestration to a less important place in the Hierarchy of Needs.</p><p>In almost all cases we set up the Observability foundations first, before adopting any LLM Orchestration framework available on the market. Sometimes these frameworks are not adopted at all (depending on the complexity of the project/product), as simple wrapper clients like <em>instructor</em> are enough and chaining can be implemented more easily without a dedicated framework.</p><p>Also, I&#8217;ve noticed a trend of organisations dropping LLM Orchestration frameworks as the applications they are building become more and more complex. This happens because low-level control becomes more important, and in most cases the frameworks hide it away from the user, making it hard to reach.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!07JF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4da8530e-ca2e-459f-b748-a4101ab5dd88_2765x1714.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!07JF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4da8530e-ca2e-459f-b748-a4101ab5dd88_2765x1714.png 424w, https://substackcdn.com/image/fetch/$s_!07JF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4da8530e-ca2e-459f-b748-a4101ab5dd88_2765x1714.png 848w, 
https://substackcdn.com/image/fetch/$s_!07JF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4da8530e-ca2e-459f-b748-a4101ab5dd88_2765x1714.png 1272w, https://substackcdn.com/image/fetch/$s_!07JF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4da8530e-ca2e-459f-b748-a4101ab5dd88_2765x1714.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!07JF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4da8530e-ca2e-459f-b748-a4101ab5dd88_2765x1714.png" width="1456" height="903" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4da8530e-ca2e-459f-b748-a4101ab5dd88_2765x1714.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0abd4f69-a9b2-42e2-b564-bd9ec1ae7c5c_2765x1714.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:903,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:318870,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/165351856?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0abd4f69-a9b2-42e2-b564-bd9ec1ae7c5c_2765x1714.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!07JF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4da8530e-ca2e-459f-b748-a4101ab5dd88_2765x1714.png 424w, 
https://substackcdn.com/image/fetch/$s_!07JF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4da8530e-ca2e-459f-b748-a4101ab5dd88_2765x1714.png 848w, https://substackcdn.com/image/fetch/$s_!07JF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4da8530e-ca2e-459f-b748-a4101ab5dd88_2765x1714.png 1272w, https://substackcdn.com/image/fetch/$s_!07JF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4da8530e-ca2e-459f-b748-a4101ab5dd88_2765x1714.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Enterprise Agentic AI Hierarchy of Needs.</figcaption></figure></div><p>Again, you can read more about Evaluation Driven Development shift here:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;6f0629a9-2fc5-4e2e-aec1-337e54f1cb11&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Evaluation Driven Development for Agentic Systems.&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;Aurimas Griciunas is an AI engineering expert, LinkedIn Top Voice, and the creator of the popular SwirlAI newsletter, trusted by thousands of data and AI professionals.&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-05-22T13:16:55.745Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4aa1ed38-f60a-4673-a181-c2e31c38551b_4218x3628.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/evaluation-driven-development-for&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:164005201,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:50,&quot;comment_count&quot;:2,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;SwirlAI 
Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1382ba7c-34db-4b33-bbb2-8e23af4b6f7a_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p></p><h3>The Vendor Landscape.</h3><p>The good thing about AI hype is that so many platforms have been and are still being created to help with operations in each layer of the infrastructure stack. The picture below displays just a handful of more popular ones in the market. 
I believe it is less than 10% of the players that exist.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!VVqd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe424d5-dd76-4b06-9aea-f89354440114_2765x1714.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!VVqd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe424d5-dd76-4b06-9aea-f89354440114_2765x1714.png 424w, https://substackcdn.com/image/fetch/$s_!VVqd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe424d5-dd76-4b06-9aea-f89354440114_2765x1714.png 848w, https://substackcdn.com/image/fetch/$s_!VVqd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe424d5-dd76-4b06-9aea-f89354440114_2765x1714.png 1272w, https://substackcdn.com/image/fetch/$s_!VVqd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe424d5-dd76-4b06-9aea-f89354440114_2765x1714.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!VVqd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe424d5-dd76-4b06-9aea-f89354440114_2765x1714.png" width="1456" height="903" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5fe424d5-dd76-4b06-9aea-f89354440114_2765x1714.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d07da114-59f8-47aa-b476-a643cab3a93b_2765x1714.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:903,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:534387,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/165351856?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd07da114-59f8-47aa-b476-a643cab3a93b_2765x1714.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!VVqd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe424d5-dd76-4b06-9aea-f89354440114_2765x1714.png 424w, https://substackcdn.com/image/fetch/$s_!VVqd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe424d5-dd76-4b06-9aea-f89354440114_2765x1714.png 848w, https://substackcdn.com/image/fetch/$s_!VVqd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe424d5-dd76-4b06-9aea-f89354440114_2765x1714.png 1272w, https://substackcdn.com/image/fetch/$s_!VVqd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5fe424d5-dd76-4b06-9aea-f89354440114_2765x1714.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">The Vendor Landscape.</figcaption></figure></div><blockquote><p>I am planning to start curating an up-to-date market map. Stay tuned for more info, and feel free to reach out if you want to be included and believe I might have missed you. 
You can reach me at <em>aurimas@swirlai.com.</em></p></blockquote><p>Now, let&#8217;s go deeper into some of the infrastructure layers defined and discuss what problems they are trying to solve.</p><p></p><h3>The Model Layer.</h3><p>The model layer can be split into 3 parts:</p><ul><li><p>GPU/CPU Hardware.</p></li><li><p>Base Infrastructure.</p></li><li><p>Foundation Models.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3Emo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6355eb39-3db8-4d85-bbe3-1f4cec557dfd_2765x1714.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3Emo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6355eb39-3db8-4d85-bbe3-1f4cec557dfd_2765x1714.png 424w, https://substackcdn.com/image/fetch/$s_!3Emo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6355eb39-3db8-4d85-bbe3-1f4cec557dfd_2765x1714.png 848w, https://substackcdn.com/image/fetch/$s_!3Emo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6355eb39-3db8-4d85-bbe3-1f4cec557dfd_2765x1714.png 1272w, https://substackcdn.com/image/fetch/$s_!3Emo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6355eb39-3db8-4d85-bbe3-1f4cec557dfd_2765x1714.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3Emo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6355eb39-3db8-4d85-bbe3-1f4cec557dfd_2765x1714.png" width="1456" 
height="903" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6355eb39-3db8-4d85-bbe3-1f4cec557dfd_2765x1714.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b811bb77-5f2f-4dde-b71e-4bd51c20d9bc_2765x1714.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:903,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:464402,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/165351856?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb811bb77-5f2f-4dde-b71e-4bd51c20d9bc_2765x1714.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3Emo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6355eb39-3db8-4d85-bbe3-1f4cec557dfd_2765x1714.png 424w, https://substackcdn.com/image/fetch/$s_!3Emo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6355eb39-3db8-4d85-bbe3-1f4cec557dfd_2765x1714.png 848w, https://substackcdn.com/image/fetch/$s_!3Emo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6355eb39-3db8-4d85-bbe3-1f4cec557dfd_2765x1714.png 1272w, https://substackcdn.com/image/fetch/$s_!3Emo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6355eb39-3db8-4d85-bbe3-1f4cec557dfd_2765x1714.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 
pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Model Layer.</figcaption></figure></div><h4>GPU/CPU Hardware.</h4><p><strong>The problem: </strong>Accelerated compute R&amp;D, manufacturing and supply.</p><p><strong>Some notable vendors/frameworks: </strong></p><ul><li><p>R&amp;D and manufacturing: NVIDIA, Groq, Google, AWS.</p></li><li><p>Supply: NVIDIA, Groq, Google, AWS, Azure, CoreWeave.</p></li></ul><p><strong>Extra notes: </strong>interestingly, with the increase in compute requirements for inference (both regular serving and test-time compute), even the big clouds are not able to keep up with demand.</p><p></p><h4>Base Infrastructure.</h4><p><strong>The problem: </strong>efficient and scalable model deployment on single 
and multi-node clusters.</p><p><strong>Some notable vendors/frameworks: </strong></p><ul><li><p>vLLM.</p></li><li><p>Kubernetes.</p></li><li><p>Slurm.</p></li></ul><p><strong>Extra notes: </strong>Vendors providing both Proprietary and Open Model APIs are leveraging this infrastructure for serving the models for the public.</p><p></p><h4>Foundation Models.</h4><p><strong>The problem: </strong>the need for general, task specific and multi-modal models increasingly capable of solving difficult problems with high precision.</p><p><strong>Some notable vendors/frameworks:</strong></p><ul><li><p>OpenAI.</p></li><li><p>Anthropic.</p></li><li><p>Google.</p></li><li><p>Mistral.</p></li><li><p>Open Source community.</p></li></ul><p></p><h3>The Application Layer.</h3><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/p/enterprise-agentic-ai-hierarchy-of?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/p/enterprise-agentic-ai-hierarchy-of?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p></p><h4>Data Storage.</h4><p>Internal data is considered to be the main differentiator for companies that build Agentic Systems nowadays.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gLND!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723a518e-4610-4dc2-92a8-7d7dff29dfe9_2765x1714.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!gLND!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723a518e-4610-4dc2-92a8-7d7dff29dfe9_2765x1714.png 424w, https://substackcdn.com/image/fetch/$s_!gLND!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723a518e-4610-4dc2-92a8-7d7dff29dfe9_2765x1714.png 848w, https://substackcdn.com/image/fetch/$s_!gLND!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723a518e-4610-4dc2-92a8-7d7dff29dfe9_2765x1714.png 1272w, https://substackcdn.com/image/fetch/$s_!gLND!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723a518e-4610-4dc2-92a8-7d7dff29dfe9_2765x1714.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gLND!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723a518e-4610-4dc2-92a8-7d7dff29dfe9_2765x1714.png" width="1456" height="903" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/723a518e-4610-4dc2-92a8-7d7dff29dfe9_2765x1714.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c536afe0-0014-40eb-8d25-1c8fcac76d65_2765x1714.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:903,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:449404,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/165351856?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc536afe0-0014-40eb-8d25-1c8fcac76d65_2765x1714.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gLND!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723a518e-4610-4dc2-92a8-7d7dff29dfe9_2765x1714.png 424w, https://substackcdn.com/image/fetch/$s_!gLND!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723a518e-4610-4dc2-92a8-7d7dff29dfe9_2765x1714.png 848w, https://substackcdn.com/image/fetch/$s_!gLND!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723a518e-4610-4dc2-92a8-7d7dff29dfe9_2765x1714.png 1272w, https://substackcdn.com/image/fetch/$s_!gLND!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F723a518e-4610-4dc2-92a8-7d7dff29dfe9_2765x1714.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Data Storage.</figcaption></figure></div><p><strong>The problem: </strong>Most production-ready enterprise Agentic Systems rely on internal context available within the enterprise. They need to integrate with a variety of data sources and retrieve this data efficiently.</p><p><strong>Some notable vendors/frameworks: </strong></p><ul><li><p>Qdrant.</p></li><li><p>Weaviate.</p></li><li><p>MongoDB.</p></li></ul><p><strong>Extra notes: </strong>Vector databases are just a piece of the puzzle. Efficient information retrieval systems should be composed of relational, graph, key-value, document and vector databases. 
It all depends on the use case and as an AI Engineer you should not always default to vector.</p><p></p><h4>Observability and Evaluation.</h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FGLL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33209064-7045-42b8-ad38-6b786046a297_2765x1714.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!FGLL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33209064-7045-42b8-ad38-6b786046a297_2765x1714.png 424w, https://substackcdn.com/image/fetch/$s_!FGLL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33209064-7045-42b8-ad38-6b786046a297_2765x1714.png 848w, https://substackcdn.com/image/fetch/$s_!FGLL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33209064-7045-42b8-ad38-6b786046a297_2765x1714.png 1272w, https://substackcdn.com/image/fetch/$s_!FGLL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33209064-7045-42b8-ad38-6b786046a297_2765x1714.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FGLL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33209064-7045-42b8-ad38-6b786046a297_2765x1714.png" width="1456" height="903" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/33209064-7045-42b8-ad38-6b786046a297_2765x1714.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ab694c99-15a8-4804-9a9b-5adc861eae3c_2765x1714.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:903,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:456302,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/165351856?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fab694c99-15a8-4804-9a9b-5adc861eae3c_2765x1714.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!FGLL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33209064-7045-42b8-ad38-6b786046a297_2765x1714.png 424w, https://substackcdn.com/image/fetch/$s_!FGLL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33209064-7045-42b8-ad38-6b786046a297_2765x1714.png 848w, https://substackcdn.com/image/fetch/$s_!FGLL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33209064-7045-42b8-ad38-6b786046a297_2765x1714.png 1272w, https://substackcdn.com/image/fetch/$s_!FGLL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F33209064-7045-42b8-ad38-6b786046a297_2765x1714.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Observability and Evaluation.</figcaption></figure></div><p><em><strong>Observability.</strong></em></p><p><strong>The problem: </strong>Agentic Systems are non-deterministic and can seem like black boxes from the outside. We need to be able to efficiently trace all of the actions that are happening within the application by instrumenting the code and then perform analytics on the traced data. Also, there is a need for LLMOps practice implementation like prompt versioning. 
It should also happen on the Observability layer.</p><p><strong>Some notable vendors/frameworks: </strong></p><ul><li><p>LangSmith.</p></li><li><p>Langfuse.</p></li><li><p>Arize.</p></li></ul><p><strong>Extra notes: </strong>Observability platforms often come with their own instrumentation SDKs. Some Evaluation as well as Versioning capabilities are part of these platforms too.</p><p></p><p><em><strong>Evaluation.</strong></em></p><p><strong>The problem: </strong>Agentic Systems are non-deterministic, so there is a need to manage exact and non-exact evaluation rules that are run against the data produced by the system. It is the only way to make sure that the system you are building and evolving behaves as expected.</p><p><strong>Some notable vendors/frameworks: </strong></p><ul><li><p>Ragas.</p></li><li><p>Arize.</p></li><li><p>Galileo.</p></li></ul><p><strong>Extra notes: </strong>While the vendors do provide some out-of-the-box evaluations, in most cases you will have to define your own evaluation rules. 
Also, you can&#8217;t blindly rely on evals defined in different platforms - even though the naming of the eval rules can match, the implementation is most likely different.</p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p></p><h4>Orchestration and Model Routing.</h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fxMQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F501b399c-6bbd-4d12-8091-74cfac81093e_2765x1714.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fxMQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F501b399c-6bbd-4d12-8091-74cfac81093e_2765x1714.png 424w, https://substackcdn.com/image/fetch/$s_!fxMQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F501b399c-6bbd-4d12-8091-74cfac81093e_2765x1714.png 848w, https://substackcdn.com/image/fetch/$s_!fxMQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F501b399c-6bbd-4d12-8091-74cfac81093e_2765x1714.png 1272w, https://substackcdn.com/image/fetch/$s_!fxMQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F501b399c-6bbd-4d12-8091-74cfac81093e_2765x1714.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!fxMQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F501b399c-6bbd-4d12-8091-74cfac81093e_2765x1714.png" width="1456" height="903" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/501b399c-6bbd-4d12-8091-74cfac81093e_2765x1714.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/42d8f320-8179-462d-a596-7cd8907bfb8b_2765x1714.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:903,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:456269,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/165351856?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42d8f320-8179-462d-a596-7cd8907bfb8b_2765x1714.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fxMQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F501b399c-6bbd-4d12-8091-74cfac81093e_2765x1714.png 424w, https://substackcdn.com/image/fetch/$s_!fxMQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F501b399c-6bbd-4d12-8091-74cfac81093e_2765x1714.png 848w, https://substackcdn.com/image/fetch/$s_!fxMQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F501b399c-6bbd-4d12-8091-74cfac81093e_2765x1714.png 1272w, 
https://substackcdn.com/image/fetch/$s_!fxMQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F501b399c-6bbd-4d12-8091-74cfac81093e_2765x1714.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Orchestration and Model Routing.</figcaption></figure></div><p><em><strong>Orchestration.</strong></em></p><p><strong>The problem: </strong>Agentic Systems often take the form of complex non-deterministic chains of LLM or other GenAI model calls. 
There is a need for frameworks that help developers quickly build these systems and manage the complexity as they evolve.</p><p><strong>Some notable vendors/frameworks: </strong></p><ul><li><p>LangGraph.</p></li><li><p>CrewAI.</p></li><li><p>LlamaIndex.</p></li></ul><p><strong>Extra notes: </strong>Very often simple wrapper clients like <em>instructor</em> are enough to start off without the need to adopt any dedicated LLM Orchestration Framework. Also, these frameworks hide some low-level implementation details that you would want to tweak to achieve the best performance of your application, so it might make sense to drop the framework when your application becomes complex enough. Having said that, frameworks like LangGraph are moving in the right direction by allowing low-level tweaks if needed.</p><p></p><p><em><strong>Model Routing.</strong></em></p><p><strong>The problem: </strong>Not all model APIs that you will be using will be stable enough to handle your production traffic. There is a need for a routing layer that allows falling back to a different model provider if there are issues with the main one.</p><p><strong>Some notable vendors/frameworks: </strong></p><ul><li><p>LiteLLM.</p></li><li><p>OrqAI.</p></li><li><p>Portkey.</p></li></ul><p><strong>Extra notes: </strong>Properly switching between models is harder than it might look. As an example, a prompt that works for OpenAI models might not work well for the Claude family of models. Together with the fallback models you should configure fallback prompts. 
You should also closely tie Routing with Observability.</p><p></p><h4>LLM Security, Agent Memory and Communication Protocols.</h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RIjB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc8e04c5-a82c-44fd-833c-52c12f06addf_2765x1714.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RIjB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc8e04c5-a82c-44fd-833c-52c12f06addf_2765x1714.png 424w, https://substackcdn.com/image/fetch/$s_!RIjB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc8e04c5-a82c-44fd-833c-52c12f06addf_2765x1714.png 848w, https://substackcdn.com/image/fetch/$s_!RIjB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc8e04c5-a82c-44fd-833c-52c12f06addf_2765x1714.png 1272w, https://substackcdn.com/image/fetch/$s_!RIjB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc8e04c5-a82c-44fd-833c-52c12f06addf_2765x1714.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RIjB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc8e04c5-a82c-44fd-833c-52c12f06addf_2765x1714.png" width="1456" height="903" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dc8e04c5-a82c-44fd-833c-52c12f06addf_2765x1714.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/de0e584e-ba38-4472-9f5b-530d79f2806d_2765x1714.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:903,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:475769,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/165351856?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fde0e584e-ba38-4472-9f5b-530d79f2806d_2765x1714.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RIjB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc8e04c5-a82c-44fd-833c-52c12f06addf_2765x1714.png 424w, https://substackcdn.com/image/fetch/$s_!RIjB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc8e04c5-a82c-44fd-833c-52c12f06addf_2765x1714.png 848w, https://substackcdn.com/image/fetch/$s_!RIjB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc8e04c5-a82c-44fd-833c-52c12f06addf_2765x1714.png 1272w, https://substackcdn.com/image/fetch/$s_!RIjB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdc8e04c5-a82c-44fd-833c-52c12f06addf_2765x1714.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">LLM Security, Memory and Agent Communication Protocols.</figcaption></figure></div><p><em><strong>Security.</strong></em></p><p><strong>The problem: </strong>Real Agentic Systems have agency over some of our internal systems (e.g. data retrieval, automated ticket creation etc.). Malicious actors can manipulate natural language based interfaces to extract sensitive data or perform unintended actions within your infrastructure. 
We need safety guardrails that can prevent this and help us identify existing vulnerabilities.</p><p><strong>Some notable vendors/frameworks: </strong></p><ul><li><p>splxAI.</p></li><li><p>Lakera.</p></li><li><p>WhyLabs.</p></li></ul><p><strong>Extra notes: </strong>LLM Security can be split into multiple categories like:</p><ul><li><p>AI Application Red Teaming - continuous attempts to jailbreak your application.</p></li><li><p>Guardrails - making sure that no unexpected data (e.g. PII) reaches an LLM or is exposed to the user of your application.</p></li></ul><p></p><p><em><strong>Agent Memory.</strong></em></p><p><strong>The problem: </strong>Effective reasoning and planning capabilities of Agentic Systems strongly rely on the actions that the system has already taken as well as on the context available to the organisation, both internally and externally. We are used to modelling this memory by splitting it into short-term and long-term. There is a need for a layer that helps efficiently manage and retrieve relevant memories on demand. You can read more about the types of memories in Agentic Systems here:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;0f7d65ac-2e1d-43f6-9ae6-3551eeaee771&quot;,&quot;caption&quot;:&quot;&#128075; I am Aurimas. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. 
My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Memory in Agent Systems&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;Aurimas Griciunas is an AI engineering expert, LinkedIn Top Voice, and the creator of the popular SwirlAI newsletter, trusted by thousands of data and AI professionals.&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2024-10-30T10:03:28.773Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c7650705-54b4-49a3-91a4-aad0c4093c4b_2926x2198.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/memory-in-agent-systems&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:150888366,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:75,&quot;comment_count&quot;:1,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;SwirlAI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1382ba7c-34db-4b33-bbb2-8e23af4b6f7a_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p><strong>Some notable vendors/frameworks: 
</strong></p><ul><li><p>mem0.</p></li><li><p>Cognee.</p></li><li><p>Letta.</p></li></ul><p><strong>Extra notes: </strong>These memory layers are not just databases, but rather frameworks for efficient memory management and retrieval.</p><p></p><p><em><strong>Communication Protocols.</strong></em></p><p><strong>The problem: </strong>As we are entering the era of IoA (Internet of Agents), where AI Agents are distributed over the network and developed by different organisations, we need standards for how the communication between these systems should be handled.</p><p>You can read more about how MCP and A2A protocols work here:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;76399901-c468-4f6d-befe-00dc6109dc4d&quot;,&quot;caption&quot;:&quot;&#128075; I am Aurimas. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;MCP vs. 
A2A: Friends or Foes?&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;Aurimas Griciunas is an AI engineering expert, LinkedIn Top Voice, and the creator of the popular SwirlAI newsletter, trusted by thousands of data and AI professionals.&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-04-13T09:36:31.451Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/65e21ee4-ed01-4ff3-967c-e1a57ccbdd41_3037x2606.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/mcp-vs-a2a-friends-or-foes&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:161199380,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:127,&quot;comment_count&quot;:4,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;SwirlAI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1382ba7c-34db-4b33-bbb2-8e23af4b6f7a_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p><strong>Some notable vendors/frameworks: </strong></p><ul><li><p>MCP.</p></li><li><p>A2A.</p></li><li><p>Agentcy.</p></li></ul><p><strong>Extra notes: </strong>Open protocols for Agent communication are important, but they are just one piece of the picture. There will be a need for standards that govern all of the existing protocols and the other missing pieces. 
E.g.</p><ul><li><p>How do we standardise tracing and Observability in multi-agent IoA systems?</p></li><li><p>How do we retain the identity of the running job of an AI Agent instance if the communication standard is not unified in different parts of the pipeline?</p></li></ul><p></p><h3><strong>Summary.</strong></h3><ul><li><p>Agentic Systems might look simple from the outside, but it takes a lot of effort to bring them to production reliably.</p></li><li><p>The proliferation of Vendors in the AI space due to the heat of the market is an amazing benefit we get as builders.</p></li><li><p>You should carefully think about the minimal amount of additional infrastructure elements you need as your AI application matures.</p></li><li><p>Adopting Evaluation Driven development is key, especially when building for enterprises.</p></li></ul><p></p><p>Hope you enjoyed this article and hope to see you next Wednesday!</p><div><hr></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/p/sponsorships&quot;,&quot;text&quot;:&quot;Partner with SwirlAI&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.newsletter.swirlai.com/p/sponsorships"><span>Partner with SwirlAI</span></a></p><p class="button-wrapper" 
data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/p/what-is-ai-engineering?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&amp;token=eyJ1c2VyX2lkIjoxNDEyMjI1OSwicG9zdF9pZCI6MTUxNzczMTkwLCJpYXQiOjE3MzQ3NDY1MTEsImV4cCI6MTczNzMzODUxMSwiaXNzIjoicHViLTExNDQxNzEiLCJzdWIiOiJwb3N0LXJlYWN0aW9uIn0.ozYMMhYkRStOvKkGl3QkPp9i4bSdxZ1NzUbWlfhxm98&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.newsletter.swirlai.com/p/what-is-ai-engineering?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&amp;token=eyJ1c2VyX2lkIjoxNDEyMjI1OSwicG9zdF9pZCI6MTUxNzczMTkwLCJpYXQiOjE3MzQ3NDY1MTEsImV4cCI6MTczNzMzODUxMSwiaXNzIjoicHViLTExNDQxNzEiLCJzdWIiOiJwb3N0LXJlYWN0aW9uIn0.ozYMMhYkRStOvKkGl3QkPp9i4bSdxZ1NzUbWlfhxm98"><span>Share</span></a></p>]]></content:encoded></item><item><title><![CDATA[Breaking into AI Engineering in 2025.]]></title><description><![CDATA[A roadmap that will help you up-skill or re-skill into an AI Engineer role.]]></description><link>https://www.newsletter.swirlai.com/p/breaking-into-ai-engineering-in-2025</link><guid isPermaLink="false">https://www.newsletter.swirlai.com/p/breaking-into-ai-engineering-in-2025</guid><dc:creator><![CDATA[Aurimas Griciūnas]]></dc:creator><pubDate>Wed, 04 Jun 2025 10:21:37 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/db68af97-9e4f-43e7-8c1d-37a79c90e939_7992x6613.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div><hr></div><p>&#128075; <em>I am <a href="https://www.linkedin.com/in/aurimas-griciunas">Aurimas</a>. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. 
My mission is to help You UpSkill and keep You updated on the latest news in AI Engineering, Data Engineering, Machine Learning and overall Data space.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">SwirlAI Newsletter is a reader-supported publication. To receive new posts and support my work, consider becoming a subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><p>Agentic AI and other buzzwords are emerging almost monthly if not more often. In reality they all describe different variations of Agentic Systems: whether it is an agentic workflow or a multi-agent system, it&#8217;s just a different topology under the same umbrella.</p><p>If you are considering a career in AI Engineering in 2025, it might feel overwhelming and that is completely normal.</p><p>But you need to remember - you are not too late to the game. 
The role as such has only emerged over the past few years and is still rapidly evolving.</p><p>In order to excel in this competitive space, you will need a clear path and focused skills.</p><p>Here is a roadmap you should follow if you want to excel as an AI Engineer in today&#8217;s landscape.</p><p></p><div><hr></div><p>Join me tomorrow (June 5th) in a free live webinar where I will go through how to <strong>Deploy Reliable AI Systems with LLMOps.</strong></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://maven.com/p/8c6863/deploy-reliable-ai-systems-with-llm-ops&quot;,&quot;text&quot;:&quot;Sign up for free&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://maven.com/p/8c6863/deploy-reliable-ai-systems-with-llm-ops"><span>Sign up for free</span></a></p><div><hr></div><p></p><h3>Fundamentals - learn as you go.</h3><p>I have always been a believer that learning fundamentals is key to your career growth. This has not changed.</p><p>However, I have to admit that the game itself has changed with the speed at which the industry is moving forward. Starting off with fundamentals before anything else is no longer an option. 
Hence, you should be continuously learning them as you build out a modern AI Engineering skillset.</p><p>Here is a list of concepts and technologies I would be learning and applying in my day-to-day if I were to start fresh.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!QVCf!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F365a14bd-f7b5-448d-8da5-584f7cbcedab_3988x2252.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!QVCf!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F365a14bd-f7b5-448d-8da5-584f7cbcedab_3988x2252.png 424w, https://substackcdn.com/image/fetch/$s_!QVCf!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F365a14bd-f7b5-448d-8da5-584f7cbcedab_3988x2252.png 848w, https://substackcdn.com/image/fetch/$s_!QVCf!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F365a14bd-f7b5-448d-8da5-584f7cbcedab_3988x2252.png 1272w, https://substackcdn.com/image/fetch/$s_!QVCf!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F365a14bd-f7b5-448d-8da5-584f7cbcedab_3988x2252.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!QVCf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F365a14bd-f7b5-448d-8da5-584f7cbcedab_3988x2252.png" width="1456" height="822" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/365a14bd-f7b5-448d-8da5-584f7cbcedab_3988x2252.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ecbef042-5405-4d1e-a9af-740ca768f023_3988x2252.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:822,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:884077,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/165082701?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fecbef042-5405-4d1e-a9af-740ca768f023_3988x2252.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!QVCf!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F365a14bd-f7b5-448d-8da5-584f7cbcedab_3988x2252.png 424w, https://substackcdn.com/image/fetch/$s_!QVCf!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F365a14bd-f7b5-448d-8da5-584f7cbcedab_3988x2252.png 848w, https://substackcdn.com/image/fetch/$s_!QVCf!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F365a14bd-f7b5-448d-8da5-584f7cbcedab_3988x2252.png 1272w, https://substackcdn.com/image/fetch/$s_!QVCf!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F365a14bd-f7b5-448d-8da5-584f7cbcedab_3988x2252.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">The Fundamentals.</figcaption></figure></div><p><strong>Python and Bash:</strong></p><ul><li><p>FastAPI - almost all of the backend services implemented in Python are now running as FastAPI servers.</p></li><li><p>Pydantic - the go-to framework for data type validation. It is now also a Python standard for implementing structured outputs in LLM-based applications.</p></li><li><p>uv - the next generation Python package manager. 
I haven&#8217;t seen any new projects not using it.</p></li><li><p>git - get your software version control fundamentals right.</p></li><li><p>Asynchronous programming - extremely important in LLM-based applications as your Agentic topologies will often benefit from calling multiple LLM APIs asynchronously without blocking.</p></li><li><p>Learn how to wrap your applications into CLI tools that can then be executed as CLI scripts.</p></li></ul><p><strong>Statistics and Machine Learning:</strong></p><ul><li><p>Understand the non-deterministic nature of Statistical models.</p></li><li><p>Types of Machine Learning models - it will help you when LLMs are not the best fit to solve a non-deterministic problem.</p></li><li><p>General knowledge in statistics will help you in evaluating LLM-based systems.</p></li><li><p>Don&#8217;t fall into the trap of thinking that AI Engineering is just Software Engineering with LLMs. Some maths and statistics are involved.</p></li></ul><div><hr></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;6f7a6e05-5d3c-405b-a54e-767e789f96f3&quot;,&quot;caption&quot;:&quot;&#128075; I am Aurimas. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. 
My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;What is AI Engineering?&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;Aurimas Griciunas is an AI engineering expert, LinkedIn Top Voice, and the creator of the popular SwirlAI newsletter, trusted by thousands of data and AI professionals.&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2024-11-30T10:36:00.463Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/df15f15b-3871-489b-83f4-7c8c38df9f6f_2709x2402.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/what-is-ai-engineering&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:151773190,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:100,&quot;comment_count&quot;:0,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;SwirlAI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1382ba7c-34db-4b33-bbb2-8e23af4b6f7a_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div><hr></div><p></p><h3>LLM and GenAI APIs.</h3><p>You should start 
simple: before picking up any LLM Orchestration Framework, begin with native client libraries. The most popular is naturally OpenAI&#8217;s client, but don&#8217;t disregard Google&#8217;s genai library. It is not compatible with OpenAI APIs, but you will certainly find use cases for Gemini models.</p><p>So what should you learn?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!cdtc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39947c59-dc12-4dba-a6ae-a480f3515f96_4004x2261.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!cdtc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39947c59-dc12-4dba-a6ae-a480f3515f96_4004x2261.png 424w, https://substackcdn.com/image/fetch/$s_!cdtc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39947c59-dc12-4dba-a6ae-a480f3515f96_4004x2261.png 848w, https://substackcdn.com/image/fetch/$s_!cdtc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39947c59-dc12-4dba-a6ae-a480f3515f96_4004x2261.png 1272w, https://substackcdn.com/image/fetch/$s_!cdtc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39947c59-dc12-4dba-a6ae-a480f3515f96_4004x2261.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!cdtc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39947c59-dc12-4dba-a6ae-a480f3515f96_4004x2261.png" width="1456" height="822" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/39947c59-dc12-4dba-a6ae-a480f3515f96_4004x2261.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/418044da-319b-4d79-859b-aab7c588a4f7_4004x2261.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:822,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:904999,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/165082701?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F418044da-319b-4d79-859b-aab7c588a4f7_4004x2261.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!cdtc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39947c59-dc12-4dba-a6ae-a480f3515f96_4004x2261.png 424w, https://substackcdn.com/image/fetch/$s_!cdtc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39947c59-dc12-4dba-a6ae-a480f3515f96_4004x2261.png 848w, https://substackcdn.com/image/fetch/$s_!cdtc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39947c59-dc12-4dba-a6ae-a480f3515f96_4004x2261.png 1272w, https://substackcdn.com/image/fetch/$s_!cdtc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39947c59-dc12-4dba-a6ae-a480f3515f96_4004x2261.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">LLM APIs.</figcaption></figure></div><p><strong>Types of LLMs:</strong></p><ul><li><p>Foundation vs. 
Fine-tuned.</p></li><li><p>Code, conversational, medical, etc.</p></li><li><p>Reasoning Models.</p></li><li><p>Multi-Modal Models.</p></li></ul><p><strong>Structured outputs:</strong></p><ul><li><p>Learn how OpenAI and Claude enforce structured outputs via function calling and tool use.</p></li><li><p>Try out simple abstraction libraries like Instructor - they are enough for most use cases and use Pydantic natively for the structure definition.</p></li></ul><p><strong>Prompt Caching:</strong></p><ul><li><p>Learn how KV caching helps reduce generation latency and costs.</p></li><li><p>Native prompt caching provided by LLM providers.</p></li><li><p>How LLM serving frameworks implement it in their APIs (e.g. vLLM).</p></li></ul><p></p><h3>Model Adaptation.</h3><p>I love the term Model Adaptation. The first time (and maybe the only time) I&#8217;ve seen it in literature was in the book &#8220;<a href="https://www.oreilly.com/library/view/ai-engineering/9781098166298/">AI Engineering</a>&#8221; by <span class="mention-wrap" data-attrs="{&quot;name&quot;:&quot;Chip Huyen&quot;,&quot;id&quot;:4141198,&quot;type&quot;:&quot;user&quot;,&quot;url&quot;:null,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F00c3e330-2634-4bac-8e23-c3cedff65db3_2297x2297.jpeg&quot;,&quot;uuid&quot;:&quot;7dfee428-40eb-4d21-8357-48e620cf1508&quot;}" data-component-name="MentionToDOM">Chip Huyen</span>. 
The term ideally encompasses what we, AI Engineers, do to make LLMs perform actions we expect.</p><p>What should you learn?</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!jvR6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65deed9a-28d3-4f45-9ecc-3c7782b18bb0_3990x2253.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jvR6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65deed9a-28d3-4f45-9ecc-3c7782b18bb0_3990x2253.png 424w, https://substackcdn.com/image/fetch/$s_!jvR6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65deed9a-28d3-4f45-9ecc-3c7782b18bb0_3990x2253.png 848w, https://substackcdn.com/image/fetch/$s_!jvR6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65deed9a-28d3-4f45-9ecc-3c7782b18bb0_3990x2253.png 1272w, https://substackcdn.com/image/fetch/$s_!jvR6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65deed9a-28d3-4f45-9ecc-3c7782b18bb0_3990x2253.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jvR6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65deed9a-28d3-4f45-9ecc-3c7782b18bb0_3990x2253.png" width="1456" height="822" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/65deed9a-28d3-4f45-9ecc-3c7782b18bb0_3990x2253.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7fdbc2de-e933-4315-81ed-0e376c91cd01_3990x2253.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:822,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:898204,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/165082701?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7fdbc2de-e933-4315-81ed-0e376c91cd01_3990x2253.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jvR6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65deed9a-28d3-4f45-9ecc-3c7782b18bb0_3990x2253.png 424w, https://substackcdn.com/image/fetch/$s_!jvR6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65deed9a-28d3-4f45-9ecc-3c7782b18bb0_3990x2253.png 848w, https://substackcdn.com/image/fetch/$s_!jvR6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65deed9a-28d3-4f45-9ecc-3c7782b18bb0_3990x2253.png 1272w, https://substackcdn.com/image/fetch/$s_!jvR6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F65deed9a-28d3-4f45-9ecc-3c7782b18bb0_3990x2253.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Model Adaptation.</figcaption></figure></div><p><strong>Prompt Engineering:</strong></p><ul><li><p>Learn the proper prompt structure. 
It will differ depending on the provider you are using.</p></li><li><p>Understand context size limitations.</p></li><li><p>Prompting techniques like Chain of Thought, Tree of Thought, Few-shot.</p></li><li><p>Advanced prompting techniques: Self-consistency, Reflection, ReAct.</p></li></ul><p><strong>Tool Use:</strong></p><ul><li><p>Tool Use is not magic, learn how it is implemented via context manipulation.</p></li><li><p>Don&#8217;t rush to agents yet, learn how LLMs are augmented with tools first.</p></li><li><p>You might want to pick up a simple LLM Orchestrator Framework at this stage.</p></li></ul><div><hr></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;c2df78e9-1369-4697-8a30-afc287ff9b46&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Building AI Agents from scratch - Part 1: Tool use&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;Aurimas Griciunas is an AI engineering expert, LinkedIn Top Voice, and the creator of the popular SwirlAI newsletter, trusted by thousands of data and AI 
professionals.&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2024-12-21T10:30:19.983Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1144abe7-1fb8-4190-b32d-6e59647c858b_2974x2388.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/building-ai-agents-from-scratch-part&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:153433846,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:283,&quot;comment_count&quot;:19,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;SwirlAI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1382ba7c-34db-4b33-bbb2-8e23af4b6f7a_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div><hr></div><p><strong>Finetuning:</strong></p><ul><li><p>Learn when it is worth Finetuning vs. just using Prompt Engineering or implementing RAG. 
In most cases it is not worth the effort.</p></li><li><p>Try out tools like Unsloth for quick learning if you do decide to get your hands dirty.</p></li></ul><p></p><h3>Storage and Retrieval.</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TYu7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bf3c4b7-f081-4af2-a562-be4ee371ed4c_3962x2238.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TYu7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bf3c4b7-f081-4af2-a562-be4ee371ed4c_3962x2238.png 424w, https://substackcdn.com/image/fetch/$s_!TYu7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bf3c4b7-f081-4af2-a562-be4ee371ed4c_3962x2238.png 848w, https://substackcdn.com/image/fetch/$s_!TYu7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bf3c4b7-f081-4af2-a562-be4ee371ed4c_3962x2238.png 1272w, https://substackcdn.com/image/fetch/$s_!TYu7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bf3c4b7-f081-4af2-a562-be4ee371ed4c_3962x2238.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!TYu7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bf3c4b7-f081-4af2-a562-be4ee371ed4c_3962x2238.png" width="1456" height="822" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2bf3c4b7-f081-4af2-a562-be4ee371ed4c_3962x2238.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/744b7035-6a7f-46d9-9ead-c5b578e8ea06_3962x2238.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:822,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:891537,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/165082701?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F744b7035-6a7f-46d9-9ead-c5b578e8ea06_3962x2238.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TYu7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bf3c4b7-f081-4af2-a562-be4ee371ed4c_3962x2238.png 424w, https://substackcdn.com/image/fetch/$s_!TYu7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bf3c4b7-f081-4af2-a562-be4ee371ed4c_3962x2238.png 848w, https://substackcdn.com/image/fetch/$s_!TYu7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bf3c4b7-f081-4af2-a562-be4ee371ed4c_3962x2238.png 1272w, 
https://substackcdn.com/image/fetch/$s_!TYu7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bf3c4b7-f081-4af2-a562-be4ee371ed4c_3962x2238.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Storage and Retrieval.</figcaption></figure></div><p><strong>Vector Databases:</strong></p><ul><li><p>Learn strengths and weaknesses of vector similarity search.</p></li><li><p>Different types of Vector DB indexes: Flat, IVFFlat, HNSW.</p></li><li><p>When PostgreSQL pgvector is enough.</p></li></ul><p><strong>Graph 
Databases:</strong></p><ul><li><p>Get a high-level understanding of Graph Databases.</p></li><li><p>Don&#8217;t spend too much time here: practical use of Graph DBs is still limited, even though the promises connected with Graph Retrieval were and still are big.</p></li><li><p>Current challenges still revolve around the cost of data preparation for Graph Databases.</p></li></ul><p><strong>Hybrid retrieval:</strong></p><ul><li><p>Learn how to combine the best of keyword and semantic retrieval to get the most accurate results.</p></li></ul><p></p><h3>RAG and Agentic RAG.</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nLrB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F968077f9-d409-496c-9354-503820f96531_3958x2236.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nLrB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F968077f9-d409-496c-9354-503820f96531_3958x2236.png 424w, https://substackcdn.com/image/fetch/$s_!nLrB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F968077f9-d409-496c-9354-503820f96531_3958x2236.png 848w, https://substackcdn.com/image/fetch/$s_!nLrB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F968077f9-d409-496c-9354-503820f96531_3958x2236.png 1272w, https://substackcdn.com/image/fetch/$s_!nLrB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F968077f9-d409-496c-9354-503820f96531_3958x2236.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!nLrB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F968077f9-d409-496c-9354-503820f96531_3958x2236.png" width="1456" height="823" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/968077f9-d409-496c-9354-503820f96531_3958x2236.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/80037cd1-c553-43fb-bbfc-6400ab350642_3958x2236.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:823,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:890178,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/165082701?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F80037cd1-c553-43fb-bbfc-6400ab350642_3958x2236.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nLrB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F968077f9-d409-496c-9354-503820f96531_3958x2236.png 424w, https://substackcdn.com/image/fetch/$s_!nLrB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F968077f9-d409-496c-9354-503820f96531_3958x2236.png 848w, https://substackcdn.com/image/fetch/$s_!nLrB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F968077f9-d409-496c-9354-503820f96531_3958x2236.png 1272w, 
https://substackcdn.com/image/fetch/$s_!nLrB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F968077f9-d409-496c-9354-503820f96531_3958x2236.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">RAG and Agentic RAG.</figcaption></figure></div><p><strong>Data Preprocessing:</strong></p><ul><li><p>Learn to clean data before computing Embeddings.</p></li><li><p>Different chunking strategies.</p></li><li><p>Extracting useful metadata to be stored next to the embeddings.</p></li><li><p>Advanced techniques like Contextual 
Embeddings.</p></li></ul><p><strong>Data Retrieval, Generation and Reranking:</strong></p><ul><li><p>Experiment with the amount of data being retrieved.</p></li><li><p>Query rewriting strategies.</p></li><li><p>Prompting for Generation with retrieved Context.</p></li><li><p>Learn how reranking of retrieved results can improve the accuracy of retrieval in your RAG and Agentic RAG systems.</p></li></ul><div><hr></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;72fc0c55-8ffb-4480-9a7f-2e388ba0a08e&quot;,&quot;caption&quot;:&quot;&#128075; I am Aurimas. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;The evolution of Modern RAG Architectures.&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;Aurimas Griciunas is an AI engineering expert, LinkedIn Top Voice, and the creator of the popular SwirlAI newsletter, trusted by thousands of data and AI 
professionals.&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-04-07T07:43:33.250Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7430fbad-21da-4918-88cc-3d593254f310_2789x2392.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/the-evolution-of-modern-rag-architectures&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:159546301,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:65,&quot;comment_count&quot;:0,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;SwirlAI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1382ba7c-34db-4b33-bbb2-8e23af4b6f7a_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div><hr></div><p><strong>MCP:</strong></p><ul><li><p>Agentic RAG is where MCP starts to play a role: you can implement different data sources behind MCP Servers. By doing so, you decouple your application from the data source and leave domain responsibility with the data owner.</p></li></ul><div><hr></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;980b9d3f-6144-4b82-bb13-d809b22fb260&quot;,&quot;caption&quot;:&quot;&#128075; I am Aurimas. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. 
My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Everything you need to know about MCP.&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;Aurimas Griciunas is an AI engineering expert, LinkedIn Top Voice, and the creator of the popular SwirlAI newsletter, trusted by thousands of data and AI professionals.&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-03-15T15:16:01.285Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9c348e1b-a175-4c65-8ea6-d773f957488e_1934x1554.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/everything-you-need-to-know-about&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:159065609,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:103,&quot;comment_count&quot;:6,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;SwirlAI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1382ba7c-34db-4b33-bbb2-8e23af4b6f7a_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div><hr></div><p><strong>LLM Orchestration 
Frameworks:</strong></p><ul><li><p>You don&#8217;t need to rush into choosing an Orchestration Framework. Most of them hide the low-level implementation from you, and you would be better off starting out without any Framework whatsoever, using light wrappers like Instructor instead.</p></li><li><p>Once you want to pick up an Orchestrator, I would go for the popular ones because that is what you will run into in the real world:</p><ul><li><p>LangChain/LangGraph.</p></li><li><p>CrewAI.</p></li><li><p>LlamaIndex.</p></li><li><p>Test out Agent SDKs of Hyper-scalers and AI Labs.</p></li></ul></li></ul><p></p><h3>AI Agents.</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!5mAI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98943b8a-3d83-43a9-a0fa-a1d75b08d37b_3970x2242.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!5mAI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98943b8a-3d83-43a9-a0fa-a1d75b08d37b_3970x2242.png 424w, https://substackcdn.com/image/fetch/$s_!5mAI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98943b8a-3d83-43a9-a0fa-a1d75b08d37b_3970x2242.png 848w, https://substackcdn.com/image/fetch/$s_!5mAI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98943b8a-3d83-43a9-a0fa-a1d75b08d37b_3970x2242.png 1272w, https://substackcdn.com/image/fetch/$s_!5mAI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98943b8a-3d83-43a9-a0fa-a1d75b08d37b_3970x2242.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!5mAI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98943b8a-3d83-43a9-a0fa-a1d75b08d37b_3970x2242.png" width="1456" height="822" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/98943b8a-3d83-43a9-a0fa-a1d75b08d37b_3970x2242.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1dffe64c-a2e0-42bb-9e71-2d7945a05deb_3970x2242.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:822,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:879276,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/165082701?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1dffe64c-a2e0-42bb-9e71-2d7945a05deb_3970x2242.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!5mAI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98943b8a-3d83-43a9-a0fa-a1d75b08d37b_3970x2242.png 424w, https://substackcdn.com/image/fetch/$s_!5mAI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98943b8a-3d83-43a9-a0fa-a1d75b08d37b_3970x2242.png 848w, https://substackcdn.com/image/fetch/$s_!5mAI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98943b8a-3d83-43a9-a0fa-a1d75b08d37b_3970x2242.png 1272w, 
https://substackcdn.com/image/fetch/$s_!5mAI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F98943b8a-3d83-43a9-a0fa-a1d75b08d37b_3970x2242.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">AI Agents.</figcaption></figure></div><p><strong>AI Agent and Multi-Agent Design Patterns:</strong></p><ul><li><p>ReAct.</p></li><li><p>Task 
Decomposition.</p></li><li><p>Reflexion.</p></li><li><p>Planner-Executor.</p></li><li><p>Critic-Actor.</p></li><li><p>Hierarchical.</p></li><li><p>Collaborative.</p></li><li><p>&#8230;</p></li></ul><p><strong>Memory:</strong></p><ul><li><p>Learn about Long and Short-Term memory in Agentic Systems and how to implement it in the real world.</p></li></ul><div><hr></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;9852d95e-b8aa-4f0d-9704-c693680117c9&quot;,&quot;caption&quot;:&quot;&#128075; I am Aurimas. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Memory in Agent Systems&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;Aurimas Griciunas is an AI engineering expert, LinkedIn Top Voice, and the creator of the popular SwirlAI newsletter, trusted by thousands of data and AI 
professionals.&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2024-10-30T10:03:28.773Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c7650705-54b4-49a3-91a4-aad0c4093c4b_2926x2198.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/memory-in-agent-systems&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:150888366,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:72,&quot;comment_count&quot;:1,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;SwirlAI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1382ba7c-34db-4b33-bbb2-8e23af4b6f7a_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div><hr></div><ul><li><p>Try out <em>mem0 </em>- the leading Framework in the industry for managing memory. 
It now also has an MCP server that you can plug into your agents.</p></li></ul><p><strong>Human in or on the loop:</strong></p><ul><li><p>Learn how to delegate certain actions back to humans if the Agent is not capable of solving the problem or the problem is too sensitive.</p></li><li><p>Human in the loop - a human is always responsible for confirming or performing certain actions.</p></li><li><p>Human on the loop - the Agent decides if human intervention is needed.</p></li></ul><p><strong>A2A, ACP, etc.:</strong></p><ul><li><p>Start learning Agent Communication Protocols like A2A by Google or ACP by IBM.</p></li><li><p>There are more Protocols popping up each week, but the idea is the same.</p></li><li><p>The Internet of Agents is becoming a real thing. Agents are implemented by different companies or teams, and they will need to communicate with each other in a distributed fashion.</p></li></ul><p><strong>Agent Orchestration Frameworks:</strong></p><ul><li><p>Put more focus on the Agent Orchestration Frameworks listed in the previous section.</p></li></ul><p></p><div><hr></div><p>Master this roadmap together with me in the End-to-End AI Engineering Bootcamp (&#120813;&#120812;% &#120305;&#120310;&#120320;&#120304;&#120316;&#120322;&#120315;&#120321; &#120304;&#120316;&#120305;&#120306;: Kickoff10 )</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://swrlai.com/ai-bootcamp&quot;,&quot;text&quot;:&quot;AI Engineering Bootcamp&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://swrlai.com/ai-bootcamp"><span>AI Engineering Bootcamp</span></a></p><div><hr></div><p></p><h3>Infrastructure.</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!jJBc!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c992a64-685b-4f16-ac81-b92a475f3834_3984x2250.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!jJBc!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c992a64-685b-4f16-ac81-b92a475f3834_3984x2250.png 424w, https://substackcdn.com/image/fetch/$s_!jJBc!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c992a64-685b-4f16-ac81-b92a475f3834_3984x2250.png 848w, https://substackcdn.com/image/fetch/$s_!jJBc!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c992a64-685b-4f16-ac81-b92a475f3834_3984x2250.png 1272w, https://substackcdn.com/image/fetch/$s_!jJBc!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c992a64-685b-4f16-ac81-b92a475f3834_3984x2250.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!jJBc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c992a64-685b-4f16-ac81-b92a475f3834_3984x2250.png" width="1456" height="822" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7c992a64-685b-4f16-ac81-b92a475f3834_3984x2250.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/16ae33a8-11a9-46e3-9ba3-3e412c313f76_3984x2250.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:822,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:894119,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/165082701?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F16ae33a8-11a9-46e3-9ba3-3e412c313f76_3984x2250.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!jJBc!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c992a64-685b-4f16-ac81-b92a475f3834_3984x2250.png 424w, https://substackcdn.com/image/fetch/$s_!jJBc!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c992a64-685b-4f16-ac81-b92a475f3834_3984x2250.png 848w, https://substackcdn.com/image/fetch/$s_!jJBc!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c992a64-685b-4f16-ac81-b92a475f3834_3984x2250.png 1272w, https://substackcdn.com/image/fetch/$s_!jJBc!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7c992a64-685b-4f16-ac81-b92a475f3834_3984x2250.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption">Infrastructure.</figcaption></figure></div><p><strong>Kubernetes:</strong></p><ul><li><p>Have at least a basic understanding of Docker and Kubernetes.</p></li><li><p>Even if your current company does not use K8s, you are more likely than not to run into one that does.</p></li></ul><p><strong>Cloud Services:</strong></p><ul><li><p>Each of the major cloud providers has its own set of services meant to help AI builders:</p><ul><li><p>Azure AI Studio.</p></li><li><p>Google Vertex AI.</p></li><li><p>AWS Bedrock.</p></li></ul></li></ul><p><strong>CI/CD:</strong></p><ul><li><p>Learn how to integrate Evaluation checks into your CI/CD pipelines.</p></li><li><p>Understand how Unit 
Eval Tests are different from Regression Eval Tests.</p></li><li><p>Load test your applications.</p></li></ul><p><strong>Model Routing:</strong></p><ul><li><p>Learn how to implement Model fallback strategies.</p></li><li><p>Try tools like LiteLLM, Orq or Martian.</p></li></ul><p><strong>LLM Deployment:</strong></p><ul><li><p>Learn the basics of LLM deployment Frameworks like vLLM.</p></li><li><p>Don&#8217;t focus too much on this, as in the real world you will rarely need to deploy your own models.</p></li></ul><p></p><h3>Observability and Evaluation.</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Zg_I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d00e5a7-cf7b-4d5e-b32d-ef35b96f0d4b_3981x2249.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Zg_I!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d00e5a7-cf7b-4d5e-b32d-ef35b96f0d4b_3981x2249.png 424w, https://substackcdn.com/image/fetch/$s_!Zg_I!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d00e5a7-cf7b-4d5e-b32d-ef35b96f0d4b_3981x2249.png 848w, https://substackcdn.com/image/fetch/$s_!Zg_I!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d00e5a7-cf7b-4d5e-b32d-ef35b96f0d4b_3981x2249.png 1272w, https://substackcdn.com/image/fetch/$s_!Zg_I!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d00e5a7-cf7b-4d5e-b32d-ef35b96f0d4b_3981x2249.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Zg_I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d00e5a7-cf7b-4d5e-b32d-ef35b96f0d4b_3981x2249.png" width="1456" height="823" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7d00e5a7-cf7b-4d5e-b32d-ef35b96f0d4b_3981x2249.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/818bbcc2-e0e7-4372-b58e-9ef50bd974fe_3981x2249.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:823,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:897812,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/165082701?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F818bbcc2-e0e7-4372-b58e-9ef50bd974fe_3981x2249.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Zg_I!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d00e5a7-cf7b-4d5e-b32d-ef35b96f0d4b_3981x2249.png 424w, https://substackcdn.com/image/fetch/$s_!Zg_I!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d00e5a7-cf7b-4d5e-b32d-ef35b96f0d4b_3981x2249.png 848w, https://substackcdn.com/image/fetch/$s_!Zg_I!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d00e5a7-cf7b-4d5e-b32d-ef35b96f0d4b_3981x2249.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Zg_I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7d00e5a7-cf7b-4d5e-b32d-ef35b96f0d4b_3981x2249.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Observability and Evaluation.</figcaption></figure></div><p><strong>AI Agent Instrumentation:</strong></p><ul><li><p>Learn what SDKs exist for instrumenting Agentic applications. Some examples:</p><ul><li><p>LangSmith SDK.</p></li><li><p>Opik SDK.</p></li><li><p>OpenLLMetry.</p></li><li><p> &#8230;</p></li></ul></li><li><p>Learn Multi-Agent 
system Instrumentation. How do we connect traces from multiple agents into a single thread?</p></li><li><p>You can also dig deeper into OpenTelemetry because most of the modern LLM Instrumentation SDKs are built on top of it.</p></li></ul><p><strong>Observability Platforms:</strong></p><ul><li><p>There are many Observability platforms available off the shelf, but you still need to learn the fundamentals of LLM Observability:</p><ul><li><p>Traces and Spans.</p></li><li><p>Evaluation datasets.</p></li><li><p>Experimenting with changes to your application.</p></li><li><p>Sampling Traces.</p></li><li><p>Prompt versioning and monitoring.</p></li><li><p>Alerting.</p></li><li><p>Feedback collection.</p></li><li><p>Annotation.</p></li></ul></li></ul><p><strong>Evaluation Techniques:</strong></p><ul><li><p>Understand the costs associated with LLM-as-a-judge based evaluations:</p><ul><li><p>Latency related.</p></li><li><p>Monetary related.</p></li></ul></li><li><p>Know at which step of the pipeline you should run evaluations to get the most out of them. You will not be able to evaluate every run in production due to cost constraints.</p></li><li><p>Learn alternatives to LLM-based evaluation:</p><ul><li><p>Rule based.</p></li><li><p>Regex based.</p></li><li><p>Regular Statistical measures.</p></li></ul></li></ul><p>Recently, I wrote a piece on building and evolving your Agentic Systems. The ideas in it are tightly connected to being able to Observe and Evaluate your systems as they are being built out. 
Read more here:</p><div><hr></div><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;6c252e09-1fee-412b-a9f3-2bb57b1202b9&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Evaluation Driven Development for Agentic Systems.&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;Aurimas Griciunas is an AI engineering expert, LinkedIn Top Voice, and the creator of the popular SwirlAI newsletter, trusted by thousands of data and AI professionals.&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-05-22T13:16:55.745Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4aa1ed38-f60a-4673-a181-c2e31c38551b_4218x3628.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/evaluation-driven-development-for&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:164005201,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:44,&quot;comment_count&quot;:2,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;SwirlAI 
Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1382ba7c-34db-4b33-bbb2-8e23af4b6f7a_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><div><hr></div><p></p><h3>Security.</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!MhJy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9d5de10-2c0b-407c-972a-df292e1593e1_3990x2253.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!MhJy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9d5de10-2c0b-407c-972a-df292e1593e1_3990x2253.png 424w, https://substackcdn.com/image/fetch/$s_!MhJy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9d5de10-2c0b-407c-972a-df292e1593e1_3990x2253.png 848w, https://substackcdn.com/image/fetch/$s_!MhJy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9d5de10-2c0b-407c-972a-df292e1593e1_3990x2253.png 1272w, https://substackcdn.com/image/fetch/$s_!MhJy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9d5de10-2c0b-407c-972a-df292e1593e1_3990x2253.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!MhJy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9d5de10-2c0b-407c-972a-df292e1593e1_3990x2253.png" width="1456" height="822" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b9d5de10-2c0b-407c-972a-df292e1593e1_3990x2253.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/39392632-7980-4781-ac45-70215fc9d100_3990x2253.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:822,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:895157,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/165082701?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39392632-7980-4781-ac45-70215fc9d100_3990x2253.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!MhJy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9d5de10-2c0b-407c-972a-df292e1593e1_3990x2253.png 424w, https://substackcdn.com/image/fetch/$s_!MhJy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9d5de10-2c0b-407c-972a-df292e1593e1_3990x2253.png 848w, https://substackcdn.com/image/fetch/$s_!MhJy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9d5de10-2c0b-407c-972a-df292e1593e1_3990x2253.png 1272w, 
https://substackcdn.com/image/fetch/$s_!MhJy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9d5de10-2c0b-407c-972a-df292e1593e1_3990x2253.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Security.</figcaption></figure></div><p><strong>Guardrails:</strong></p><ul><li><p>Learn how to guardrail inputs to and outputs from the LLM calls.</p></li><li><p>Different strategies:</p><ul><li><p>LLM based checks.</p></li><li><p>Deterministic rules (e.g. 
Regex based).</p></li></ul></li><li><p>Try out tools like GuardrailsAI.</p></li></ul><p><strong>Testing LLM based applications:</strong></p><ul><li><p>Learn how to test the security of your applications.</p></li><li><p>Try to break your own Guardrails and jailbreak out of your system prompt instructions.</p></li><li><p>Perform advanced Red Teaming to test emerging attack strategies and vectors.</p></li></ul><p></p><h3>Looking Forward.</h3><p>The future development of Agents will be an interesting area to observe. Startups that succeed will most likely do so thanks to one of the following:</p><ul><li><p>Distribution.</p></li><li><p>Good UX.</p></li><li><p>Real competitive moats, like physical products. This is where robotics comes into play.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!RmwQ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93629f41-6d74-418e-9dd1-17c222218239_4005x2262.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!RmwQ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93629f41-6d74-418e-9dd1-17c222218239_4005x2262.png 424w, https://substackcdn.com/image/fetch/$s_!RmwQ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93629f41-6d74-418e-9dd1-17c222218239_4005x2262.png 848w, https://substackcdn.com/image/fetch/$s_!RmwQ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93629f41-6d74-418e-9dd1-17c222218239_4005x2262.png 1272w, 
https://substackcdn.com/image/fetch/$s_!RmwQ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93629f41-6d74-418e-9dd1-17c222218239_4005x2262.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!RmwQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93629f41-6d74-418e-9dd1-17c222218239_4005x2262.png" width="1456" height="822" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/93629f41-6d74-418e-9dd1-17c222218239_4005x2262.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6e5a03d1-640b-4af8-8e02-0ec8108bf1ce_4005x2262.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:822,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:901281,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/165082701?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e5a03d1-640b-4af8-8e02-0ec8108bf1ce_4005x2262.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!RmwQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93629f41-6d74-418e-9dd1-17c222218239_4005x2262.png 424w, https://substackcdn.com/image/fetch/$s_!RmwQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93629f41-6d74-418e-9dd1-17c222218239_4005x2262.png 848w, 
https://substackcdn.com/image/fetch/$s_!RmwQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93629f41-6d74-418e-9dd1-17c222218239_4005x2262.png 1272w, https://substackcdn.com/image/fetch/$s_!RmwQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93629f41-6d74-418e-9dd1-17c222218239_4005x2262.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Looking Forward Elements.</figcaption></figure></div><p><strong>Voice, Vision and 
Robotics:</strong></p><ul><li><p>An interesting blend of capabilities that would allow a physical machine to interact with the world. The areas I am most looking forward to are:</p><ul><li><p>On-device Agents.</p></li><li><p>Extreme Quantisation techniques.</p></li><li><p>Foundation Models tuned specifically for robotics purposes.</p></li></ul></li></ul><p><strong>Automated Prompt Engineering:</strong></p><ul><li><p>New techniques are emerging that automate Prompt Engineering, provided you have good test datasets ready for evaluation.</p></li><li><p>Play around with frameworks like DSPy or AdalFlow.</p></li></ul><p></p><h3>Summary.</h3><p>The skillset required of AI Engineers grows every month. The truth is that in your day-to-day work you will only need a subset of it.</p><p>You should always start with your immediate challenges and adapt the roadmap accordingly. </p><p>However, don&#8217;t forget to look back and learn the fundamental techniques that power more advanced systems. 
In many cases these fundamentals are hidden behind layers of abstraction.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0SNH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff23fdce8-fc60-4206-9641-cadbc6804f88_4000x2259.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0SNH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff23fdce8-fc60-4206-9641-cadbc6804f88_4000x2259.png 424w, https://substackcdn.com/image/fetch/$s_!0SNH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff23fdce8-fc60-4206-9641-cadbc6804f88_4000x2259.png 848w, https://substackcdn.com/image/fetch/$s_!0SNH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff23fdce8-fc60-4206-9641-cadbc6804f88_4000x2259.png 1272w, https://substackcdn.com/image/fetch/$s_!0SNH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff23fdce8-fc60-4206-9641-cadbc6804f88_4000x2259.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0SNH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff23fdce8-fc60-4206-9641-cadbc6804f88_4000x2259.png" width="1456" height="822" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f23fdce8-fc60-4206-9641-cadbc6804f88_4000x2259.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/319a740d-bcb9-468b-bae9-33d51ed89c23_4000x2259.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:822,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:974646,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/165082701?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F319a740d-bcb9-468b-bae9-33d51ed89c23_4000x2259.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0SNH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff23fdce8-fc60-4206-9641-cadbc6804f88_4000x2259.png 424w, https://substackcdn.com/image/fetch/$s_!0SNH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff23fdce8-fc60-4206-9641-cadbc6804f88_4000x2259.png 848w, https://substackcdn.com/image/fetch/$s_!0SNH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff23fdce8-fc60-4206-9641-cadbc6804f88_4000x2259.png 1272w, https://substackcdn.com/image/fetch/$s_!0SNH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff23fdce8-fc60-4206-9641-cadbc6804f88_4000x2259.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">AI Engineering Roadmap.</figcaption></figure></div><p>Happy building!</p><p>Hope you enjoyed this article and hope to see you next Wednesday!</p><p></p><div><hr></div><p>Master this roadmap together with me in the End-to-End AI Engineering Bootcamp (&#120813;&#120812;% &#120305;&#120310;&#120320;&#120304;&#120316;&#120322;&#120315;&#120321; &#120304;&#120316;&#120305;&#120306;: Kickoff10 )</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://swrlai.com/ai-bootcamp&quot;,&quot;text&quot;:&quot;AI Engineering Bootcamp&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" 
href="https://swrlai.com/ai-bootcamp"><span>AI Engineering Bootcamp</span></a></p><div><hr></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/p/sponsorships&quot;,&quot;text&quot;:&quot;Partner with SwirlAI&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.newsletter.swirlai.com/p/sponsorships"><span>Partner with SwirlAI</span></a></p><div><hr></div>]]></content:encoded></item><item><title><![CDATA[Evaluation Driven Development for Agentic 
Systems.]]></title><description><![CDATA[My step-by-step approach for building and evolving Agentic Systems that work.]]></description><link>https://www.newsletter.swirlai.com/p/evaluation-driven-development-for</link><guid isPermaLink="false">https://www.newsletter.swirlai.com/p/evaluation-driven-development-for</guid><dc:creator><![CDATA[Aurimas Griciūnas]]></dc:creator><pubDate>Thu, 22 May 2025 13:16:55 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/4aa1ed38-f60a-4673-a181-c2e31c38551b_4218x3628.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div><hr></div><p>&#128075; <em>I am <a href="https://www.linkedin.com/in/aurimas-griciunas">Aurimas</a>. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.</em></p><p>SwirlAI Newsletter is a reader-supported publication. To receive new posts and support my work, consider becoming a subscriber.</p>
<div><hr></div><p>I have been developing Agentic Systems for around two years now. The same patterns keep emerging again and again, regardless of what kind of system is being built.</p><p>I learned them the hard way, as many others have. The first project is rarely a great success, but you learn from the failures and apply those lessons in the next one. Then you iterate.</p><p>Today, I am sharing my system for developing LLM based applications from idea to production. Use it if you want to avoid painful lessons in your own projects.</p><p>In this Newsletter episode I will cover:</p><ul><li><p>The Evaluation driven application development system.</p></li><li><p>Moving from Prototype to PoC to MVP.</p></li><li><p>Evolving your Agentic Systems.</p></li><li><p>The role of LLMOps in the development of AI Agents.</p></li></ul><div><hr></div><p>Before we move forward, I want to give a shoutout to the MLOps Community and my friend Demetrios for organising amazing events that help move the AI industry forward. One free event I am looking forward to will take place in San Francisco: <strong>AI Agent Builders Summit: World Tour Kickoff.</strong></p><p>The event brings together builders, researchers, and doers working on AI Agents in Production in the most lo-fi way possible. &#8203;<strong>Only engineers. No marketers. 
</strong>A<strong> </strong>great line-up including Anthropic, Boundary ML, Cleric and more.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rJSr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2e7d84b-62c2-4d18-abe5-dd2a98e87f7a_800x419.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rJSr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2e7d84b-62c2-4d18-abe5-dd2a98e87f7a_800x419.jpeg 424w, https://substackcdn.com/image/fetch/$s_!rJSr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2e7d84b-62c2-4d18-abe5-dd2a98e87f7a_800x419.jpeg 848w, https://substackcdn.com/image/fetch/$s_!rJSr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2e7d84b-62c2-4d18-abe5-dd2a98e87f7a_800x419.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!rJSr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2e7d84b-62c2-4d18-abe5-dd2a98e87f7a_800x419.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rJSr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2e7d84b-62c2-4d18-abe5-dd2a98e87f7a_800x419.jpeg" width="562" height="294.3475" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e2e7d84b-62c2-4d18-abe5-dd2a98e87f7a_800x419.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:419,&quot;width&quot;:800,&quot;resizeWidth&quot;:562,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rJSr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2e7d84b-62c2-4d18-abe5-dd2a98e87f7a_800x419.jpeg 424w, https://substackcdn.com/image/fetch/$s_!rJSr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2e7d84b-62c2-4d18-abe5-dd2a98e87f7a_800x419.jpeg 848w, https://substackcdn.com/image/fetch/$s_!rJSr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2e7d84b-62c2-4d18-abe5-dd2a98e87f7a_800x419.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!rJSr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe2e7d84b-62c2-4d18-abe5-dd2a98e87f7a_800x419.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path 
d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I highly encourage you to register if you are in San Francisco on May 28th.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://lu.ma/agents-world-tour-sf?tk=IdFdLW&quot;,&quot;text&quot;:&quot;Register for Free.&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://lu.ma/agents-world-tour-sf?tk=IdFdLW"><span>Register for Free.</span></a></p><div><hr></div><p></p><h4>AI Product Development Lifecycle.</h4><p>Below is a high-level diagram that represents the development lifecycle of a modern AI Product.</p><p>Today&#8217;s goal is to go through it step by step and highlight the key considerations to keep in mind if you choose to follow the same path in your projects.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!NWec!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45eb6d43-b5e2-4cb7-a4ac-a064e7e59c8f_3081x2095.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NWec!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45eb6d43-b5e2-4cb7-a4ac-a064e7e59c8f_3081x2095.png 424w, https://substackcdn.com/image/fetch/$s_!NWec!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45eb6d43-b5e2-4cb7-a4ac-a064e7e59c8f_3081x2095.png 848w, https://substackcdn.com/image/fetch/$s_!NWec!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45eb6d43-b5e2-4cb7-a4ac-a064e7e59c8f_3081x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!NWec!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45eb6d43-b5e2-4cb7-a4ac-a064e7e59c8f_3081x2095.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NWec!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45eb6d43-b5e2-4cb7-a4ac-a064e7e59c8f_3081x2095.png" width="1456" height="990" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/45eb6d43-b5e2-4cb7-a4ac-a064e7e59c8f_3081x2095.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/52b19266-5390-4fe4-a2b5-86a2b7da7450_3081x2095.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:990,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:584549,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/164005201?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52b19266-5390-4fe4-a2b5-86a2b7da7450_3081x2095.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NWec!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45eb6d43-b5e2-4cb7-a4ac-a064e7e59c8f_3081x2095.png 424w, https://substackcdn.com/image/fetch/$s_!NWec!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45eb6d43-b5e2-4cb7-a4ac-a064e7e59c8f_3081x2095.png 848w, https://substackcdn.com/image/fetch/$s_!NWec!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45eb6d43-b5e2-4cb7-a4ac-a064e7e59c8f_3081x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!NWec!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F45eb6d43-b5e2-4cb7-a4ac-a064e7e59c8f_3081x2095.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">AI Product Development Lifecycle.</figcaption></figure></div><p>Let&#8217;s start!</p><p></p><h3>Defining The Problem.</h3><p>The first step of any Agentic System development is defining the problem you want to solve. 
An Agentic System is one in which Large Language Models or other GenAI models are used to solve complex, real-world problems, in many cases involving automation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!w-ab!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F796b9bf7-a846-4555-9f3b-88be387b4cc4_3079x2095.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!w-ab!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F796b9bf7-a846-4555-9f3b-88be387b4cc4_3079x2095.png 424w, https://substackcdn.com/image/fetch/$s_!w-ab!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F796b9bf7-a846-4555-9f3b-88be387b4cc4_3079x2095.png 848w, https://substackcdn.com/image/fetch/$s_!w-ab!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F796b9bf7-a846-4555-9f3b-88be387b4cc4_3079x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!w-ab!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F796b9bf7-a846-4555-9f3b-88be387b4cc4_3079x2095.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!w-ab!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F796b9bf7-a846-4555-9f3b-88be387b4cc4_3079x2095.png" width="1456" height="991" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/796b9bf7-a846-4555-9f3b-88be387b4cc4_3079x2095.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/183f9821-df01-420a-9fe3-4547d284365c_3079x2095.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:991,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:133775,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/164005201?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F183f9821-df01-420a-9fe3-4547d284365c_3079x2095.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!w-ab!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F796b9bf7-a846-4555-9f3b-88be387b4cc4_3079x2095.png 424w, https://substackcdn.com/image/fetch/$s_!w-ab!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F796b9bf7-a846-4555-9f3b-88be387b4cc4_3079x2095.png 848w, https://substackcdn.com/image/fetch/$s_!w-ab!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F796b9bf7-a846-4555-9f3b-88be387b4cc4_3079x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!w-ab!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F796b9bf7-a846-4555-9f3b-88be387b4cc4_3079x2095.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Defining The Problem.</figcaption></figure></div><p>It's vital to ensure the problem is <strong>clearly defined, bounded</strong> and <strong>aligned with business goals</strong>. 
Some questions to consider:</p><ul><li><p>Is the problem best solved by AI or traditional software?</p></li><li><p>Who is the end user?</p></li><li><p>What are the edge cases?</p></li><li><p>What are the boundaries of acceptable behaviour?</p></li></ul><p><strong>Roles to involve:</strong> AI Product Managers, Domain Experts, AI Engineers.</p><p><strong>Important: </strong>Many AI projects fail not due to bad models, but due to solving the <em>wrong</em> problem.</p><p></p><h3>Building a Prototype.</h3><p>Once you know the problem is a good fit for AI, start rapid prototyping. Depending on the size of the organisation, this stage can be handled by different people. I am a strong believer in full-stack AI Engineering, where AI Engineers prototype and communicate with stakeholders directly, but in large enterprises this is also where AI Product Managers can have a big impact.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!k75a!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b8cb715-c542-40e3-9483-a3e90663d659_3054x2095.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!k75a!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b8cb715-c542-40e3-9483-a3e90663d659_3054x2095.png 424w, https://substackcdn.com/image/fetch/$s_!k75a!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b8cb715-c542-40e3-9483-a3e90663d659_3054x2095.png 848w, https://substackcdn.com/image/fetch/$s_!k75a!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b8cb715-c542-40e3-9483-a3e90663d659_3054x2095.png 
1272w, https://substackcdn.com/image/fetch/$s_!k75a!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b8cb715-c542-40e3-9483-a3e90663d659_3054x2095.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!k75a!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b8cb715-c542-40e3-9483-a3e90663d659_3054x2095.png" width="1456" height="999" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7b8cb715-c542-40e3-9483-a3e90663d659_3054x2095.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/39cf8998-4e5e-4774-9174-2c80e8030975_3054x2095.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:999,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:158503,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/164005201?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F39cf8998-4e5e-4774-9174-2c80e8030975_3054x2095.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!k75a!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b8cb715-c542-40e3-9483-a3e90663d659_3054x2095.png 424w, https://substackcdn.com/image/fetch/$s_!k75a!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b8cb715-c542-40e3-9483-a3e90663d659_3054x2095.png 848w, 
https://substackcdn.com/image/fetch/$s_!k75a!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b8cb715-c542-40e3-9483-a3e90663d659_3054x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!k75a!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b8cb715-c542-40e3-9483-a3e90663d659_3054x2095.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Building a Prototype.</figcaption></figure></div><p><strong>Some points to consider:</strong></p><ul><li><p>Use 
Notebooks or no-code tools, small datasets and off-the-shelf models.</p></li><li><p>The goal at this stage is to learn rather than to focus on performance.</p></li><li><p>Document everything so that you don&#8217;t repeat mistakes in the future.</p></li><li><p>This is where you will do a lot of prompting and researching the market for tools that could potentially help in solving the problem (e.g. Voice to Text platforms).</p></li></ul><p><strong>Roles to involve:</strong> AI Product Managers, AI Engineers.</p><p><strong>Important: </strong>This phase can be treated as a de-risking practice. While your idea might look good on paper, it might not be technically feasible in the long term.</p><p></p><h3>Defining Performance Metrics.</h3><p>You are never just building a cool application; you are solving a real business problem that needs to be grounded in the specific metrics you are planning to optimise.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iQBv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32554b2b-54d9-4dac-b145-50260947a92d_3036x2095.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iQBv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32554b2b-54d9-4dac-b145-50260947a92d_3036x2095.png 424w, https://substackcdn.com/image/fetch/$s_!iQBv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32554b2b-54d9-4dac-b145-50260947a92d_3036x2095.png 848w, 
https://substackcdn.com/image/fetch/$s_!iQBv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32554b2b-54d9-4dac-b145-50260947a92d_3036x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!iQBv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32554b2b-54d9-4dac-b145-50260947a92d_3036x2095.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iQBv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32554b2b-54d9-4dac-b145-50260947a92d_3036x2095.png" width="1456" height="1005" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/32554b2b-54d9-4dac-b145-50260947a92d_3036x2095.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/63fef130-ad84-482e-8f6b-285ec4452e32_3036x2095.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1005,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:178823,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/164005201?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F63fef130-ad84-482e-8f6b-285ec4452e32_3036x2095.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iQBv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32554b2b-54d9-4dac-b145-50260947a92d_3036x2095.png 424w, 
https://substackcdn.com/image/fetch/$s_!iQBv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32554b2b-54d9-4dac-b145-50260947a92d_3036x2095.png 848w, https://substackcdn.com/image/fetch/$s_!iQBv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32554b2b-54d9-4dac-b145-50260947a92d_3036x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!iQBv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F32554b2b-54d9-4dac-b145-50260947a92d_3036x2095.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Defining Performance Metrics.</figcaption></figure></div><p><strong>Key points to consider:</strong></p><ul><li><p>What is it that you are trying to optimise for? E.g. reduced headcount, improved user satisfaction or increased development velocity.</p></li><li><p>The above is your north star output metric. Now you need to split it into input metrics that can actually drive the output metric forward, e.g. reducing the average time to Customer Support ticket resolution.</p></li><li><p>Your application should be targeting the input metrics, but the output metric will be what really matters for the business.</p></li></ul><p><strong>Roles to involve:</strong> AI Product Managers, AI Engineers, Business Stakeholders.</p><p><strong>Important: </strong>Without setting this stage up properly you risk the project being deprioritised for not showing enough business value. Remember to align with the business before you start implementing; it is not enough to simply have it written down on paper.</p><p></p><h3>Defining Evaluation Rules.</h3><p>Metrics for LLMs are notoriously tricky, e.g. human alignment, coherence and factuality. 
However there are also exact rules that are extremely useful in evaluating your applications.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vlrE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcc9885c-0aea-40f2-9fdb-4ab66b5f1643_3007x2095.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vlrE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcc9885c-0aea-40f2-9fdb-4ab66b5f1643_3007x2095.png 424w, https://substackcdn.com/image/fetch/$s_!vlrE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcc9885c-0aea-40f2-9fdb-4ab66b5f1643_3007x2095.png 848w, https://substackcdn.com/image/fetch/$s_!vlrE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcc9885c-0aea-40f2-9fdb-4ab66b5f1643_3007x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!vlrE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcc9885c-0aea-40f2-9fdb-4ab66b5f1643_3007x2095.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vlrE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcc9885c-0aea-40f2-9fdb-4ab66b5f1643_3007x2095.png" width="1456" height="1014" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fcc9885c-0aea-40f2-9fdb-4ab66b5f1643_3007x2095.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/83d4f044-0f80-4229-8466-f11371ffaca6_3007x2095.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1014,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:190799,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/164005201?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F83d4f044-0f80-4229-8466-f11371ffaca6_3007x2095.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vlrE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcc9885c-0aea-40f2-9fdb-4ab66b5f1643_3007x2095.png 424w, https://substackcdn.com/image/fetch/$s_!vlrE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcc9885c-0aea-40f2-9fdb-4ab66b5f1643_3007x2095.png 848w, https://substackcdn.com/image/fetch/$s_!vlrE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcc9885c-0aea-40f2-9fdb-4ab66b5f1643_3007x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!vlrE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffcc9885c-0aea-40f2-9fdb-4ab66b5f1643_3007x2095.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Defining Evaluation Rules.</figcaption></figure></div><p><strong>Some points to consider:</strong></p><ul><li><p>Very likely your system will be composed of multiple LLM calls chained together within an Agentic System topology.</p></li><li><p>For any node in your Agentic System topology you should have an evaluation dataset prepared: Inputs &#8594; Expected Outputs.</p></li><li><p>Define unacceptable responses. E.g. 
toxicity, hallucinations, unsafe suggestions.</p></li></ul><p><strong>Roles to involve:</strong> AI Product Managers, AI Engineers.</p><p></p><div><hr></div><p>I will be teaching how to apply the system described in this blog hands-on and in detail as part of the End-to-End AI Engineering Bootcamp (10% discount code: Kickoff10).</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://swrlai.com/ai-bootcamp&quot;,&quot;text&quot;:&quot;Check it out&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://swrlai.com/ai-bootcamp"><span>Check it out</span></a></p><div><hr></div><p></p><h3>Building a PoC.</h3><p>This is a stage that is often misunderstood. You might be tempted to go straight from prompts to a functioning interface (CLI, chat UI, API, etc.). However, the goal here is to push the system out to users as soon as possible. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ZlJa!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6b4e21c-2710-499a-88f1-1e56c6f16252_2985x2095.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ZlJa!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6b4e21c-2710-499a-88f1-1e56c6f16252_2985x2095.png 424w, https://substackcdn.com/image/fetch/$s_!ZlJa!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6b4e21c-2710-499a-88f1-1e56c6f16252_2985x2095.png 848w, https://substackcdn.com/image/fetch/$s_!ZlJa!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6b4e21c-2710-499a-88f1-1e56c6f16252_2985x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!ZlJa!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6b4e21c-2710-499a-88f1-1e56c6f16252_2985x2095.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ZlJa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6b4e21c-2710-499a-88f1-1e56c6f16252_2985x2095.png" width="1456" height="1022" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a6b4e21c-2710-499a-88f1-1e56c6f16252_2985x2095.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b5883d65-042e-4a3d-8af6-450bb17e954b_2985x2095.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1022,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:328379,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/164005201?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb5883d65-042e-4a3d-8af6-450bb17e954b_2985x2095.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ZlJa!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6b4e21c-2710-499a-88f1-1e56c6f16252_2985x2095.png 424w, https://substackcdn.com/image/fetch/$s_!ZlJa!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6b4e21c-2710-499a-88f1-1e56c6f16252_2985x2095.png 848w, https://substackcdn.com/image/fetch/$s_!ZlJa!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6b4e21c-2710-499a-88f1-1e56c6f16252_2985x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!ZlJa!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6b4e21c-2710-499a-88f1-1e56c6f16252_2985x2095.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Building a PoC.</figcaption></figure></div><p><strong>Points to consider:</strong></p><ul><li><p>Use LLM APIs from OpenAI, Google, Anthropic, X, etc. to quickly build out the first user-facing application.</p></li><li><p>Your application could be an Excel Spreadsheet with input-output pairs rather than a full-fledged functioning interface. As long as it helps move the metrics forward, it is good enough to be exposed to users.</p></li><li><p>The feedback you get from users is key to understanding unknown unknowns. 
In my experience it almost always shifts your perspective on how to improve the application.</p></li></ul><p><strong>Roles to involve:</strong> AI Engineers.</p><p><strong>Important:</strong> A successful LLM PoC may look like an Excel Spreadsheet. If you can&#8217;t push it out quickly, there is something wrong with the process.</p><p></p><h3>Instrumenting the Application.</h3><p>This is where we implement best practices of Observability in LLM-based systems. We do this by instrumenting the application and logging an extensive set of metadata about everything that is happening underneath the surface.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!REh5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcf77b03-5b2a-4120-b7f3-3b5305230587_2973x2095.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!REh5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcf77b03-5b2a-4120-b7f3-3b5305230587_2973x2095.png 424w, https://substackcdn.com/image/fetch/$s_!REh5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcf77b03-5b2a-4120-b7f3-3b5305230587_2973x2095.png 848w, https://substackcdn.com/image/fetch/$s_!REh5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcf77b03-5b2a-4120-b7f3-3b5305230587_2973x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!REh5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcf77b03-5b2a-4120-b7f3-3b5305230587_2973x2095.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!REh5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcf77b03-5b2a-4120-b7f3-3b5305230587_2973x2095.png" width="1456" height="1026" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dcf77b03-5b2a-4120-b7f3-3b5305230587_2973x2095.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c569fed7-9ca9-447e-a720-4f9b039dce50_2973x2095.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1026,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:342723,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/164005201?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc569fed7-9ca9-447e-a720-4f9b039dce50_2973x2095.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!REh5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcf77b03-5b2a-4120-b7f3-3b5305230587_2973x2095.png 424w, https://substackcdn.com/image/fetch/$s_!REh5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcf77b03-5b2a-4120-b7f3-3b5305230587_2973x2095.png 848w, https://substackcdn.com/image/fetch/$s_!REh5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcf77b03-5b2a-4120-b7f3-3b5305230587_2973x2095.png 1272w, 
https://substackcdn.com/image/fetch/$s_!REh5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdcf77b03-5b2a-4120-b7f3-3b5305230587_2973x2095.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Instrumenting the Application.</figcaption></figure></div><p><strong>Key considerations:</strong></p><ul><li><p>Log everything: prompts, completions, embeddings, latency, token counts, and user feedback.</p></li><li><p>Add additional metadata like: prompt versions, user inputs, model versions used.</p></li><li><p>Make sure that the chains are 
properly connected and that you know the order of operations.</p></li><li><p>When working with multimodal data, log the different kinds of data, such as PDFs, images, audio and video.</p></li><li><p>Remember that outputs of one LLM call will often become inputs to the next one.</p></li><li><p>Don&#8217;t forget the user feedback! Always attach it to the traces that represent the run users were interacting with when the feedback was provided.</p></li></ul><p><strong>Roles to involve:</strong> AI Engineers.</p><p><strong>Important:</strong> This stage is one of the key elements in implementing Evaluation Driven Development.</p><p></p><h3>Integrating with an Observability Platform.</h3><p>It is not enough to just track the data; you need to be able to efficiently visualise and analyse it. This is where Observability Platforms come into play: they help with efficient search and visualisation, as well as with prompt versioning and automated evaluation capabilities.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!f93Y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b15dca7-12d7-44f5-9200-ac7261ca4400_2952x2095.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!f93Y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b15dca7-12d7-44f5-9200-ac7261ca4400_2952x2095.png 424w, https://substackcdn.com/image/fetch/$s_!f93Y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b15dca7-12d7-44f5-9200-ac7261ca4400_2952x2095.png 848w, 
https://substackcdn.com/image/fetch/$s_!f93Y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b15dca7-12d7-44f5-9200-ac7261ca4400_2952x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!f93Y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b15dca7-12d7-44f5-9200-ac7261ca4400_2952x2095.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!f93Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b15dca7-12d7-44f5-9200-ac7261ca4400_2952x2095.png" width="1456" height="1033" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8b15dca7-12d7-44f5-9200-ac7261ca4400_2952x2095.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/08b6484f-7216-4024-a2a8-62634d15c4c5_2952x2095.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1033,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:328762,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/164005201?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F08b6484f-7216-4024-a2a8-62634d15c4c5_2952x2095.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!f93Y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b15dca7-12d7-44f5-9200-ac7261ca4400_2952x2095.png 424w, 
https://substackcdn.com/image/fetch/$s_!f93Y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b15dca7-12d7-44f5-9200-ac7261ca4400_2952x2095.png 848w, https://substackcdn.com/image/fetch/$s_!f93Y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b15dca7-12d7-44f5-9200-ac7261ca4400_2952x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!f93Y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b15dca7-12d7-44f5-9200-ac7261ca4400_2952x2095.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption">Integrating with an Observability Platform.</figcaption></figure></div><p><strong>Key considerations:</strong></p><ul><li><p>Store your Evaluation rules in the platform, as you will later apply them to the traces.</p></li><li><p>Use these platforms as Prompt Registries: your application is a chain of prompts, and you will want to analyse and group the evaluation results by Prompt Groups.</p></li><li><p>Most successful applications reach scale, and at that point it becomes too expensive to store all of the traces produced. Observability Platforms have smart sampling algorithms that allow you to store a subset of the incoming traces.</p></li><li><p>Most Observability Platforms come with their own tracing SDKs; use them for seamless Instrumentation.</p></li></ul><p><strong>Roles to involve:</strong> AI Engineers.</p><p><strong>Important:</strong> Set this up early, as it will bring visibility into the black box.</p><p></p><h3>Evaluating Traced Data.</h3><p>If you have successfully implemented the previous stages, you will have all of the pieces required to measure your application that has been exposed to the users. 
Here is where you run Evals on top of the trace data.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7dFB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fbfb797-864d-4f8e-8958-525b174c24ab_2893x2095.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7dFB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fbfb797-864d-4f8e-8958-525b174c24ab_2893x2095.png 424w, https://substackcdn.com/image/fetch/$s_!7dFB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fbfb797-864d-4f8e-8958-525b174c24ab_2893x2095.png 848w, https://substackcdn.com/image/fetch/$s_!7dFB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fbfb797-864d-4f8e-8958-525b174c24ab_2893x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!7dFB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fbfb797-864d-4f8e-8958-525b174c24ab_2893x2095.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7dFB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fbfb797-864d-4f8e-8958-525b174c24ab_2893x2095.png" width="1456" height="1054" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3fbfb797-864d-4f8e-8958-525b174c24ab_2893x2095.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1054,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:335023,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/164005201?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fbfb797-864d-4f8e-8958-525b174c24ab_2893x2095.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7dFB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fbfb797-864d-4f8e-8958-525b174c24ab_2893x2095.png 424w, https://substackcdn.com/image/fetch/$s_!7dFB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fbfb797-864d-4f8e-8958-525b174c24ab_2893x2095.png 848w, https://substackcdn.com/image/fetch/$s_!7dFB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fbfb797-864d-4f8e-8958-525b174c24ab_2893x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!7dFB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3fbfb797-864d-4f8e-8958-525b174c24ab_2893x2095.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption">Evaluating Traced Data.</figcaption></figure></div><p><strong>Key considerations:</strong></p><ul><li><p>Assumption: your Evaluation rules are stored in the Observability Platform, together with the traces coming in from your instrumented application and the human feedback attached to the corresponding traces.</p></li><li><p>Run the Evals automatically on the traces that hit the Observability Platform.</p></li><li><p>Select all of the traces that have failing evals or negative human feedback. 
It is up to you to decide what counts as a failing eval or negative feedback.</p></li><li><p>We will focus mostly on this &#8220;failing&#8221; data from here on.</p></li></ul><p><strong>Roles to involve:</strong> AI Engineers.</p><p><strong>Important:</strong> Running Evals can often be expensive, especially if you are using the LLM-as-a-judge tactic in some places. You might want to sample the traces you run evals on.</p><p></p><h3>Evolving the Application.</h3><p>Once you have the above figured out, you are ready to improve your application.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!hcul!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1eee3dd0-2f37-487e-b5cc-194b8c551c88_2879x2095.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!hcul!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1eee3dd0-2f37-487e-b5cc-194b8c551c88_2879x2095.png 424w, https://substackcdn.com/image/fetch/$s_!hcul!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1eee3dd0-2f37-487e-b5cc-194b8c551c88_2879x2095.png 848w, https://substackcdn.com/image/fetch/$s_!hcul!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1eee3dd0-2f37-487e-b5cc-194b8c551c88_2879x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!hcul!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1eee3dd0-2f37-487e-b5cc-194b8c551c88_2879x2095.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!hcul!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1eee3dd0-2f37-487e-b5cc-194b8c551c88_2879x2095.png" width="1456" height="1060" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1eee3dd0-2f37-487e-b5cc-194b8c551c88_2879x2095.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/891ab5b7-b10c-4c5e-8782-768c18f25c54_2879x2095.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1060,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:407588,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/164005201?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F891ab5b7-b10c-4c5e-8782-768c18f25c54_2879x2095.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!hcul!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1eee3dd0-2f37-487e-b5cc-194b8c551c88_2879x2095.png 424w, https://substackcdn.com/image/fetch/$s_!hcul!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1eee3dd0-2f37-487e-b5cc-194b8c551c88_2879x2095.png 848w, https://substackcdn.com/image/fetch/$s_!hcul!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1eee3dd0-2f37-487e-b5cc-194b8c551c88_2879x2095.png 1272w, 
https://substackcdn.com/image/fetch/$s_!hcul!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1eee3dd0-2f37-487e-b5cc-194b8c551c88_2879x2095.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Developing the Application.</figcaption></figure></div><p><strong>Key considerations:</strong></p><ul><li><p>Focus on Failing Evaluations and human feedback to pinpoint where the improvement is needed.</p></li><li><p>If your current topology is not up to the task, make it more complex: Simple Prompts &#8594; RAG &#8594; Agentic RAG &#8594; 
Agents &#8594; Multi-agent systems.</p></li><li><p>Make the system more complex only if there is a hard requirement; otherwise, focus on better prompt engineering, data preprocessing and tool integration.</p></li><li><p>You can read more about the evolution of Modern RAG Systems in one of my previous blog posts here:</p></li></ul><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;c386f460-2773-4262-9152-c47f7375b7c4&quot;,&quot;caption&quot;:&quot;&#128075; I am Aurimas. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.&quot;,&quot;cta&quot;:&quot;Read full story&quot;,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;The evolution of Modern RAG Architectures.&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;Aurimas Griciunas is an AI engineering expert, LinkedIn Top Voice, and the creator of the popular SwirlAI newsletter, trusted by thousands of data and AI 
professionals.&quot;,&quot;photo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-04-07T07:43:33.250Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7430fbad-21da-4918-88cc-3d593254f310_2789x2392.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/the-evolution-of-modern-rag-architectures&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:159546301,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:58,&quot;comment_count&quot;:0,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;SwirlAI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F1382ba7c-34db-4b33-bbb2-8e23af4b6f7a_1280x1280.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><ul><li><p>A powerful concept in Evaluation Driven Development is keeping a failing eval dataset that you never solve to 100%. You aim for 100%, but because you keep adding more and more failing samples, you never actually get there.</p></li></ul><p><strong>Roles to involve:</strong> AI Engineers, Domain Experts.</p><p><strong>Important:</strong> Always involve Domain Experts in this stage. They have insider knowledge about the tasks you want to automate, and they can even suggest better prompts to solve the problem.</p><p></p><h3>Exposing new version of the Application.</h3><p>This one is quick - expose new versions of the application as fast as possible. 
The feedback you will be getting on the updates is invaluable.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CX5v!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa26c6970-1797-4bed-9e37-e5333fbaf309_2904x2095.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CX5v!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa26c6970-1797-4bed-9e37-e5333fbaf309_2904x2095.png 424w, https://substackcdn.com/image/fetch/$s_!CX5v!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa26c6970-1797-4bed-9e37-e5333fbaf309_2904x2095.png 848w, https://substackcdn.com/image/fetch/$s_!CX5v!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa26c6970-1797-4bed-9e37-e5333fbaf309_2904x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!CX5v!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa26c6970-1797-4bed-9e37-e5333fbaf309_2904x2095.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CX5v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa26c6970-1797-4bed-9e37-e5333fbaf309_2904x2095.png" width="1456" height="1050" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a26c6970-1797-4bed-9e37-e5333fbaf309_2904x2095.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/562e94fa-c730-4dcc-a348-c2a67d9e5492_2904x2095.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1050,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:497979,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/164005201?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F562e94fa-c730-4dcc-a348-c2a67d9e5492_2904x2095.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CX5v!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa26c6970-1797-4bed-9e37-e5333fbaf309_2904x2095.png 424w, https://substackcdn.com/image/fetch/$s_!CX5v!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa26c6970-1797-4bed-9e37-e5333fbaf309_2904x2095.png 848w, https://substackcdn.com/image/fetch/$s_!CX5v!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa26c6970-1797-4bed-9e37-e5333fbaf309_2904x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!CX5v!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa26c6970-1797-4bed-9e37-e5333fbaf309_2904x2095.png 1456w" sizes="100vw" loading="lazy"></picture>
</div></a><figcaption class="image-caption">Exposing new version of the Application.</figcaption></figure></div><p><strong>Key considerations:</strong></p><ul><li><p>Deploying new versions fast is important for a few reasons:</p><ul><li><p>It improves UX as the present problems get fixed.</p></li><li><p>Some fixes will generalise to problems you have not discovered yet, so you will be solving multiple bugs in one shot.</p></li></ul></li><li><p>Be sure to have strict release tests. You should always have evaluation datasets ready so that you know that what is being released is no worse than the previous version. 
Integrate these checks into your CI/CD pipelines.</p></li></ul><p><strong>Roles to involve:</strong> AI Engineers, Domain Experts.</p><p></p><h3>Continuous Development and Evolution of the Application.</h3><p>The highlighted loop in the image is the key part of the process when it comes to the continuous evolution of your application.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pVQz!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f8451f6-b634-4595-860b-c48ffd91e821_2901x2095.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pVQz!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f8451f6-b634-4595-860b-c48ffd91e821_2901x2095.png 424w, https://substackcdn.com/image/fetch/$s_!pVQz!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f8451f6-b634-4595-860b-c48ffd91e821_2901x2095.png 848w, https://substackcdn.com/image/fetch/$s_!pVQz!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f8451f6-b634-4595-860b-c48ffd91e821_2901x2095.png 1272w, 
https://substackcdn.com/image/fetch/$s_!pVQz!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f8451f6-b634-4595-860b-c48ffd91e821_2901x2095.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pVQz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f8451f6-b634-4595-860b-c48ffd91e821_2901x2095.png" width="1456" height="1051" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3f8451f6-b634-4595-860b-c48ffd91e821_2901x2095.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d929346d-53b8-4e5e-aa52-cf4c65c6c663_2901x2095.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1051,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:510046,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/164005201?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd929346d-53b8-4e5e-aa52-cf4c65c6c663_2901x2095.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pVQz!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f8451f6-b634-4595-860b-c48ffd91e821_2901x2095.png 424w, https://substackcdn.com/image/fetch/$s_!pVQz!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f8451f6-b634-4595-860b-c48ffd91e821_2901x2095.png 848w, 
https://substackcdn.com/image/fetch/$s_!pVQz!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f8451f6-b634-4595-860b-c48ffd91e821_2901x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!pVQz!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3f8451f6-b634-4595-860b-c48ffd91e821_2901x2095.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Continuous Development and Evolution of the Application.</figcaption></figure></div><p><strong>Key 
considerations:</strong></p><ul><li><p>Remember: Build &#8594; Trace, collect feedback &#8594; Evaluate &#8594; Focus on Failing Evals and Negative Feedback &#8594; Improve the application &#8594; Iterate.</p></li><li><p>As your business requirements become more complex, you might add functionality to the application. Very often this is added as a new route in your Agentic System Topology. For example, a simple chatbot might evolve into a system that can manage your shopping cart automatically on your instruction.</p></li><li><p>To add new functionality, follow the same process of building a prototype and defining performance metrics and new evals.</p></li><li><p>This is where multiple AI Engineers can more easily start working on a single project, as new independent routes can be developed as separate pieces of functionality.</p></li></ul><p><strong>Roles to involve:</strong> AI Engineers, Domain Experts, AI Product Managers.</p><p></p><h3>Monitoring and Alerting.</h3><p>After you have implemented all of the tracing and evaluation for development purposes, monitoring comes almost out of the box: the same evaluations and traces can be reused for monitoring. 
Configure specific alerting thresholds and enjoy the peace of mind.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-Ecn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F553541ac-a5a1-4615-8fdd-d5640498e660_2880x2095.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-Ecn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F553541ac-a5a1-4615-8fdd-d5640498e660_2880x2095.png 424w, https://substackcdn.com/image/fetch/$s_!-Ecn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F553541ac-a5a1-4615-8fdd-d5640498e660_2880x2095.png 848w, https://substackcdn.com/image/fetch/$s_!-Ecn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F553541ac-a5a1-4615-8fdd-d5640498e660_2880x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!-Ecn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F553541ac-a5a1-4615-8fdd-d5640498e660_2880x2095.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-Ecn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F553541ac-a5a1-4615-8fdd-d5640498e660_2880x2095.png" width="1456" height="1059" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/553541ac-a5a1-4615-8fdd-d5640498e660_2880x2095.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5eb95021-0a52-462b-8ba2-cdb1b092b079_2880x2095.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1059,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:477106,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/164005201?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5eb95021-0a52-462b-8ba2-cdb1b092b079_2880x2095.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-Ecn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F553541ac-a5a1-4615-8fdd-d5640498e660_2880x2095.png 424w, https://substackcdn.com/image/fetch/$s_!-Ecn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F553541ac-a5a1-4615-8fdd-d5640498e660_2880x2095.png 848w, https://substackcdn.com/image/fetch/$s_!-Ecn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F553541ac-a5a1-4615-8fdd-d5640498e660_2880x2095.png 1272w, https://substackcdn.com/image/fetch/$s_!-Ecn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F553541ac-a5a1-4615-8fdd-d5640498e660_2880x2095.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Monitoring and Alerting.</figcaption></figure></div><p><strong>Key considerations:</strong></p><ul><li><p>If you have properly implemented application instrumentation, you already have most of the data that is relevant for LLM-specific production monitoring.</p></li><li><p>If you haven&#8217;t yet, consider tracing and logging additional advanced metrics such as TTFT (Time To First Token) and inter-token latency.</p></li><li><p>You will need to figure out the thresholds for alerting.</p></li></ul><p><strong>Roles to involve:</strong> AI Engineers.</p><p><strong>Important: </strong>Try to avoid Alert Fatigue by carefully configuring the thresholds that trigger alerts. 
Avoid False Positives as much as possible.</p><p></p><h3>Wrapping up.</h3><p></p><ul><li><p>Development of LLM-based Agentic Systems differs from traditional software development, and techniques like Evaluation Driven Development are key to success.</p></li><li><p>Observability and Evaluation are key and should be implemented early in the project lifecycle.</p></li><li><p>Having well-defined business success metrics up front will help you avoid reprioritisation of your projects in the long term and keep business buy-in.</p></li><li><p>Figure out early on whether what you are about to build is feasible, and drop the idea if it is not. There is too much low-hanging fruit with high impact, so use your time wisely.</p><p></p></li></ul><p>Hope you enjoyed the writeup and hope to see you in the next one!</p><p></p><div><hr></div><p>I will be teaching how to apply the system described in this blog hands-on and in detail as part of the End-to-End AI Engineering Bootcamp (&#120813;&#120812;% &#120305;&#120310;&#120320;&#120304;&#120316;&#120322;&#120315;&#120321; &#120304;&#120316;&#120305;&#120306;: Kickoff10)</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://swrlai.com/ai-bootcamp&quot;,&quot;text&quot;:&quot;AI Engineering Bootcamp&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://swrlai.com/ai-bootcamp"><span>AI Engineering Bootcamp</span></a></p><div><hr></div><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/p/sponsorships&quot;,&quot;text&quot;:&quot;Partner with 
SwirlAI&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.newsletter.swirlai.com/p/sponsorships"><span>Partner with SwirlAI</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/p/what-is-ai-engineering?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&amp;token=eyJ1c2VyX2lkIjoxNDEyMjI1OSwicG9zdF9pZCI6MTUxNzczMTkwLCJpYXQiOjE3MzQ3NDY1MTEsImV4cCI6MTczNzMzODUxMSwiaXNzIjoicHViLTExNDQxNzEiLCJzdWIiOiJwb3N0LXJlYWN0aW9uIn0.ozYMMhYkRStOvKkGl3QkPp9i4bSdxZ1NzUbWlfhxm98&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.newsletter.swirlai.com/p/what-is-ai-engineering?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&amp;token=eyJ1c2VyX2lkIjoxNDEyMjI1OSwicG9zdF9pZCI6MTUxNzczMTkwLCJpYXQiOjE3MzQ3NDY1MTEsImV4cCI6MTczNzMzODUxMSwiaXNzIjoicHViLTExNDQxNzEiLCJzdWIiOiJwb3N0LXJlYWN0aW9uIn0.ozYMMhYkRStOvKkGl3QkPp9i4bSdxZ1NzUbWlfhxm98"><span>Share</span></a></p><p></p><div><hr></div><p>RAISE Summit is approaching and I am proud to be an official ambassador for the event.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!DmIn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F455a1794-57f7-4245-8997-e6ffe5f8206b_1200x1200.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!DmIn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F455a1794-57f7-4245-8997-e6ffe5f8206b_1200x1200.png 424w, https://substackcdn.com/image/fetch/$s_!DmIn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F455a1794-57f7-4245-8997-e6ffe5f8206b_1200x1200.png 848w, https://substackcdn.com/image/fetch/$s_!DmIn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F455a1794-57f7-4245-8997-e6ffe5f8206b_1200x1200.png 1272w, https://substackcdn.com/image/fetch/$s_!DmIn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F455a1794-57f7-4245-8997-e6ffe5f8206b_1200x1200.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!DmIn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F455a1794-57f7-4245-8997-e6ffe5f8206b_1200x1200.png" width="396" height="396" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/455a1794-57f7-4245-8997-e6ffe5f8206b_1200x1200.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1200,&quot;width&quot;:1200,&quot;resizeWidth&quot;:396,&quot;bytes&quot;:2347051,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/164005201?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F455a1794-57f7-4245-8997-e6ffe5f8206b_1200x1200.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!DmIn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F455a1794-57f7-4245-8997-e6ffe5f8206b_1200x1200.png 424w, https://substackcdn.com/image/fetch/$s_!DmIn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F455a1794-57f7-4245-8997-e6ffe5f8206b_1200x1200.png 848w, https://substackcdn.com/image/fetch/$s_!DmIn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F455a1794-57f7-4245-8997-e6ffe5f8206b_1200x1200.png 1272w, https://substackcdn.com/image/fetch/$s_!DmIn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F455a1794-57f7-4245-8997-e6ffe5f8206b_1200x1200.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" 
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><h4><strong>Want to Join?</strong></h4><p></p><p>You can win a ticket by filling out this form:</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://share.hsforms.com/1ltbFexGYTiuiwsbqM92P9w3n10g&quot;,&quot;text&quot;:&quot;Fill out the Form&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://share.hsforms.com/1ltbFexGYTiuiwsbqM92P9w3n10g"><span>Fill out the Form</span></a></p><p>Or secure your place with <strong>discount code: </strong>amEV73 <strong>for 20% off.</strong></p><div><hr></div><p></p>]]></content:encoded></item><item><title><![CDATA[Announcing the End-to-End AI Engineering Bootcamp.]]></title><description><![CDATA[Also join me this Thursday in a free Lightning Lesson on how to break into AI Engineering.]]></description><link>https://www.newsletter.swirlai.com/p/announcing-the-end-to-end-ai-engineering</link><guid isPermaLink="false">https://www.newsletter.swirlai.com/p/announcing-the-end-to-end-ai-engineering</guid><dc:creator><![CDATA[Aurimas Griciūnas]]></dc:creator><pubDate>Tue, 13 May 2025 08:44:36 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/d245b06d-3f86-401a-ba4a-4e187d5d5dbf_1139x861.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div><hr></div><p>&#128075; <em>I am <a href="https://www.linkedin.com/in/aurimas-griciunas">Aurimas</a>. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. 
My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and the overall Data space.</em></p><p>SwirlAI Newsletter is a reader-supported publication. To receive new posts and support my work, consider becoming a subscriber.</p><div><hr></div><p>After months of planning, conversations with dozens of engineers, and many late nights building and iterating, I finally have some big news!</p><p>I&#8217;m incredibly excited to finally open enrolment for my new cohort-based course: the <strong>End-to-End AI Engineering Bootcamp</strong>.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xp3M!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6a463b9-b13a-4f24-833a-af06bca320d1_1884x1660.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!xp3M!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6a463b9-b13a-4f24-833a-af06bca320d1_1884x1660.png 424w, https://substackcdn.com/image/fetch/$s_!xp3M!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6a463b9-b13a-4f24-833a-af06bca320d1_1884x1660.png 848w, https://substackcdn.com/image/fetch/$s_!xp3M!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6a463b9-b13a-4f24-833a-af06bca320d1_1884x1660.png 1272w, https://substackcdn.com/image/fetch/$s_!xp3M!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6a463b9-b13a-4f24-833a-af06bca320d1_1884x1660.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xp3M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6a463b9-b13a-4f24-833a-af06bca320d1_1884x1660.png" width="574" height="505.7980769230769" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a6a463b9-b13a-4f24-833a-af06bca320d1_1884x1660.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1283,&quot;width&quot;:1456,&quot;resizeWidth&quot;:574,&quot;bytes&quot;:405421,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/163433756?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6a463b9-b13a-4f24-833a-af06bca320d1_1884x1660.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xp3M!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6a463b9-b13a-4f24-833a-af06bca320d1_1884x1660.png 424w, https://substackcdn.com/image/fetch/$s_!xp3M!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6a463b9-b13a-4f24-833a-af06bca320d1_1884x1660.png 848w, https://substackcdn.com/image/fetch/$s_!xp3M!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6a463b9-b13a-4f24-833a-af06bca320d1_1884x1660.png 1272w, https://substackcdn.com/image/fetch/$s_!xp3M!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa6a463b9-b13a-4f24-833a-af06bca320d1_1884x1660.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>You can check it out and enrol here:</p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://swrlai.com/ai-bootcamp&quot;,&quot;text&quot;:&quot;End-to-End AI Engineering Bootcamp&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://swrlai.com/ai-bootcamp"><span>End-to-End AI Engineering Bootcamp</span></a></p><p></p><ul><li><p>This course is structured around weekly sprints, just like the way real AI teams work.</p></li><li><p>Each week, you&#8217;ll scope a feature, build it, test it, and improve it.</p></li><li><p>From RAG pipelines to agent orchestration and deployment, we&#8217;ll move through the same iterative process used by AI engineers shipping real systems.</p></li></ul><p><br><strong>Over 8 weeks, you 
will:</strong></p><ul><li><p>Build a full-stack AI app (from idea to deployment).</p></li><li><p>Master RAG, agentic and multi-agent systems.</p></li><li><p>Implement LLMOps best practices: Evaluation, Observability, continuous feedback loops and more.</p></li><li><p>Learn automated prompt engineering.</p></li><li><p>Apply emerging technologies like MCP and A2A.</p></li><li><p>Leave with a robust project you can actually showcase.<br></p></li></ul><p>If you are:</p><ul><li><p>A Data Analyst, Data Scientist, ML Engineer, Data Engineer or Software Engineer trying to re-skill and break into AI Engineering.</p></li><li><p>An AI Engineer who has some experience in building agentic systems but wants to up-skill.</p></li><li><p>An AI Leader who wants to up-skill their teams (encourage them to enrol!).</p></li></ul><p>This bootcamp is for you.</p><p>We kick off on June 23rd. Hope to see you there!</p><div><hr></div><h4>Free Lightning Lessons.</h4><p>I will also be hosting a series of free lightning lessons along the way. 
The first one is happening this Thursday already.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!to1j!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88248a35-b74b-4252-a72d-28d81a02ce0e_800x731.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!to1j!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88248a35-b74b-4252-a72d-28d81a02ce0e_800x731.jpeg 424w, https://substackcdn.com/image/fetch/$s_!to1j!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88248a35-b74b-4252-a72d-28d81a02ce0e_800x731.jpeg 848w, https://substackcdn.com/image/fetch/$s_!to1j!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88248a35-b74b-4252-a72d-28d81a02ce0e_800x731.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!to1j!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88248a35-b74b-4252-a72d-28d81a02ce0e_800x731.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!to1j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88248a35-b74b-4252-a72d-28d81a02ce0e_800x731.jpeg" width="438" height="400.2225" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/88248a35-b74b-4252-a72d-28d81a02ce0e_800x731.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:731,&quot;width&quot;:800,&quot;resizeWidth&quot;:438,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;graphical user interface, text&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="graphical user interface, text" title="graphical user interface, text" srcset="https://substackcdn.com/image/fetch/$s_!to1j!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88248a35-b74b-4252-a72d-28d81a02ce0e_800x731.jpeg 424w, https://substackcdn.com/image/fetch/$s_!to1j!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88248a35-b74b-4252-a72d-28d81a02ce0e_800x731.jpeg 848w, https://substackcdn.com/image/fetch/$s_!to1j!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88248a35-b74b-4252-a72d-28d81a02ce0e_800x731.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!to1j!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F88248a35-b74b-4252-a72d-28d81a02ce0e_800x731.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" 
stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Register here:</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://swrlai.com/ai-lightning&quot;,&quot;text&quot;:&quot;Breaking Into AI Engineering&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://swrlai.com/ai-lightning"><span>Breaking Into AI Engineering</span></a></p><p>In this session, I&#8217;ll walk through:</p><ul><li><p>What AI Engineering actually is (and what it&#8217;s not).</p></li><li><p>The end-to-end process of building AI systems - from problem framing to deployment.</p></li><li><p>The specific skills needed at each stage of that process.</p></li><li><p>How to map your current experience into this evolving role.</p></li></ul><p>If you&#8217;re a data scientist, ML engineer, software engineer, or technical team lead - this session is for you.</p><p>You&#8217;ll walk away with a clear mental model, a 
structured process, and a practical starting point to grow your impact in AI.</p><div><hr></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/p/what-is-ai-engineering?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&amp;token=eyJ1c2VyX2lkIjoxNDEyMjI1OSwicG9zdF9pZCI6MTUxNzczMTkwLCJpYXQiOjE3MzQ3NDY1MTEsImV4cCI6MTczNzMzODUxMSwiaXNzIjoicHViLTExNDQxNzEiLCJzdWIiOiJwb3N0LXJlYWN0aW9uIn0.ozYMMhYkRStOvKkGl3QkPp9i4bSdxZ1NzUbWlfhxm98&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.newsletter.swirlai.com/p/what-is-ai-engineering?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&amp;token=eyJ1c2VyX2lkIjoxNDEyMjI1OSwicG9zdF9pZCI6MTUxNzczMTkwLCJpYXQiOjE3MzQ3NDY1MTEsImV4cCI6MTczNzMzODUxMSwiaXNzIjoicHViLTExNDQxNzEiLCJzdWIiOiJwb3N0LXJlYWN0aW9uIn0.ozYMMhYkRStOvKkGl3QkPp9i4bSdxZ1NzUbWlfhxm98"><span>Share</span></a></p><p><br></p>]]></content:encoded></item><item><title><![CDATA[Distributing FlashAttention: Solving memory Bottlenecks of Context Windows]]></title><description><![CDATA[Let's explore FlashAttention, types of distributed LLM training and how to fuse both concepts together.]]></description><link>https://www.newsletter.swirlai.com/p/distributing-flashattention-solving</link><guid isPermaLink="false">https://www.newsletter.swirlai.com/p/distributing-flashattention-solving</guid><dc:creator><![CDATA[Aurimas Griciūnas]]></dc:creator><pubDate>Sun, 11 May 
2025 08:30:21 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/bf39a917-1397-4352-a252-438f08fe5671_1380x1222.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Lately, I&#8217;ve been asked several times for my thoughts on the seemingly infinite context lengths of some of the latest LLMs, and whether this development might spell the end of techniques like Retrieval-Augmented Generation (RAG) in the long (or rather mid) run. My answer is always the same: RAG isn&#8217;t going anywhere soon, because million-token context windows are still more an illusion than reality. The problems of &#8220;lost in the middle&#8221; and &#8220;needle in the haystack&#8221; have not been solved yet, even though progress has been substantial.</p><p>This recurring question inspired me to write a piece exploring the technical challenges involved in training LLMs with extended context windows.</p><p>Transformer-based models have revolutionised the field of deep learning. The ability to model long-range dependencies via self-attention is key in today&#8217;s AI systems. As these models grow in capability, we still face a key limitation: <strong>context length</strong>.</p><p>The problem: self-attention scales quadratically with sequence length. This means doubling the number of tokens roughly quadruples the computational and memory load of the attention computation. While this cost may be manageable for short sequences, it quickly becomes unsustainable as models are trained on long-form content like legal documents, books, or extended conversations. 
To achieve accurate long-context inference, we need to either train on long sequences or apply techniques that help models generalise to longer sequences, although such techniques have shown limitations.</p><p>In today&#8217;s episode we will explore:</p><ul><li><p>Memory Bottleneck of the Context Window.</p></li><li><p>What is FlashAttention.</p></li><li><p>Parallelism techniques in distributed LLM training.</p></li><li><p>Combining FlashAttention and distributed training with Kvax.</p></li></ul><p></p><div><hr></div><p>This newsletter episode was made possible by today&#8217;s sponsor, Nebius - a leader in AI Cloud technology.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Psk6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Psk6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 424w, https://substackcdn.com/image/fetch/$s_!Psk6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 848w, https://substackcdn.com/image/fetch/$s_!Psk6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 1272w, https://substackcdn.com/image/fetch/$s_!Psk6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Psk6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png" width="460" height="87.4" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:190,&quot;width&quot;:1000,&quot;resizeWidth&quot;:460,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Online classes - nebius&quot;,&quot;title&quot;:&quot;Online classes - nebius&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Online classes - nebius" title="Online classes - nebius" srcset="https://substackcdn.com/image/fetch/$s_!Psk6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 424w, https://substackcdn.com/image/fetch/$s_!Psk6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 848w, https://substackcdn.com/image/fetch/$s_!Psk6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 1272w, https://substackcdn.com/image/fetch/$s_!Psk6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 1456w" sizes="100vw" 
loading="lazy"></picture><div></div></div></a></figure></div><p>Nebius recently open-sourced Kvax, their FlashAttention implementation based on JAX. Designed for efficient training with long sequences, Kvax supports context parallelism and optimised computation of document masks. It outperforms many other FlashAttention implementations in long-context training with dense packing.</p><p>In the second part of the article we will explore how Kvax supercharges distributed training with long input sequences. I highly recommend checking out the GitHub repository if you are innovating in this area.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://github.com/nebius/kvax&quot;,&quot;text&quot;:&quot;Check out Kvax&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://github.com/nebius/kvax"><span>Check out Kvax</span></a></p><div><hr></div><p></p><h3>Explaining the Memory Bottleneck of the Context Window.</h3><p>The problem is as old as the Transformer architecture itself. Almost from the very beginning we have tried to stretch the context window of LLMs to overcome the limitations that make some applications infeasible.</p><p>The problem with extending context windows lies within the Transformer architecture itself or, to be more precise, in the Self-Attention mechanism that is the core piece of the architecture. 
Let&#8217;s look into the original Transformer description from the famous &#8220;Attention is all you need&#8221; paper:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Daa5!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F054f8f6c-486c-451d-9c94-50886a232101_1295x785.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Daa5!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F054f8f6c-486c-451d-9c94-50886a232101_1295x785.png 424w, https://substackcdn.com/image/fetch/$s_!Daa5!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F054f8f6c-486c-451d-9c94-50886a232101_1295x785.png 848w, https://substackcdn.com/image/fetch/$s_!Daa5!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F054f8f6c-486c-451d-9c94-50886a232101_1295x785.png 1272w, https://substackcdn.com/image/fetch/$s_!Daa5!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F054f8f6c-486c-451d-9c94-50886a232101_1295x785.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Daa5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F054f8f6c-486c-451d-9c94-50886a232101_1295x785.png" width="1295" height="785" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/054f8f6c-486c-451d-9c94-50886a232101_1295x785.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:785,&quot;width&quot;:1295,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:180843,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/162979886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F054f8f6c-486c-451d-9c94-50886a232101_1295x785.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Daa5!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F054f8f6c-486c-451d-9c94-50886a232101_1295x785.png 424w, https://substackcdn.com/image/fetch/$s_!Daa5!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F054f8f6c-486c-451d-9c94-50886a232101_1295x785.png 848w, https://substackcdn.com/image/fetch/$s_!Daa5!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F054f8f6c-486c-451d-9c94-50886a232101_1295x785.png 1272w, https://substackcdn.com/image/fetch/$s_!Daa5!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F054f8f6c-486c-451d-9c94-50886a232101_1295x785.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Transformer architecture. Pieces of the image taken from <a href="https://arxiv.org/pdf/1706.03762">here</a>.</figcaption></figure></div><p>The architecture of the network might seem complex at first glance, but from the implementation side it is just a long sequence of matrix multiplications. 
To understand where the memory bottleneck of the context window happens we need to zoom into the &#8220;Scaled Dot-Product Attention&#8221; piece of the graph.</p><p>We will not go into the details of each computation step but rather focus on the part that is most affected by the length of the sequence that is passed forward through the network.</p><p>Let&#8217;s say you have a sequence of five tokens &#8220;He does not know what&#8221; (<strong>important:</strong> for simplicity&#8217;s sake we assume that each word results in a single token after the tokenisation step. This is not always the case and a single word can be divided into multiple tokens). A sequence of length 5 will always be transformed into a matrix of size 5x5 when attention operations are performed.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!K4KK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc157386e-3a1d-43b2-9cbe-3466d6e7e774_894x514.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!K4KK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc157386e-3a1d-43b2-9cbe-3466d6e7e774_894x514.png 424w, https://substackcdn.com/image/fetch/$s_!K4KK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc157386e-3a1d-43b2-9cbe-3466d6e7e774_894x514.png 848w, https://substackcdn.com/image/fetch/$s_!K4KK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc157386e-3a1d-43b2-9cbe-3466d6e7e774_894x514.png 1272w, 
https://substackcdn.com/image/fetch/$s_!K4KK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc157386e-3a1d-43b2-9cbe-3466d6e7e774_894x514.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!K4KK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc157386e-3a1d-43b2-9cbe-3466d6e7e774_894x514.png" width="894" height="514" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c157386e-3a1d-43b2-9cbe-3466d6e7e774_894x514.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:514,&quot;width&quot;:894,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:47046,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/162979886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc157386e-3a1d-43b2-9cbe-3466d6e7e774_894x514.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!K4KK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc157386e-3a1d-43b2-9cbe-3466d6e7e774_894x514.png 424w, https://substackcdn.com/image/fetch/$s_!K4KK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc157386e-3a1d-43b2-9cbe-3466d6e7e774_894x514.png 848w, https://substackcdn.com/image/fetch/$s_!K4KK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc157386e-3a1d-43b2-9cbe-3466d6e7e774_894x514.png 
1272w, https://substackcdn.com/image/fetch/$s_!K4KK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc157386e-3a1d-43b2-9cbe-3466d6e7e774_894x514.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Attention matrix.</figcaption></figure></div><p>And this is where the scalability challenge of self-attention becomes particularly problematic: the infamous quadratic growth in both memory and computational requirements. 
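</p><p>To make the 5x5 matrix above concrete, here is a minimal Python sketch. The two-dimensional toy vectors and the helper names attention_scores and num_entries are assumptions made up for illustration; they do not come from any real model or library.</p><pre><code>import math

# Hypothetical toy vectors, one per token of "He does not know what".
# Real models use learned projections with many more dimensions.
tokens = ["He", "does", "not", "know", "what"]
vectors = [[0.1 * (i + 1), 0.2 * (i + 1)] for i in range(len(tokens))]


def attention_scores(queries, keys):
    """Return the raw score matrix: one scaled dot product per (query, key) pair."""
    scale = math.sqrt(len(keys[0]))
    return [
        [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
        for query in queries
    ]


def num_entries(matrix):
    """Count how many scores have to be stored for the whole matrix."""
    return sum(len(row) for row in matrix)


scores = attention_scores(vectors, vectors)
print(len(scores), len(scores[0]))  # 5 5
print(num_entries(scores))  # 25

# Doubling the sequence length quadruples the number of scores.
longer = vectors + vectors
print(num_entries(attention_scores(longer, longer)))  # 100
</code></pre><p>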
In the self-attention mechanism, each token in the input sequence attends to every other token, which requires constructing an attention matrix whose size is proportional to the square of the sequence length. Concretely, if the input sequence has a length of <em>n</em>, the attention matrix is of size <em>n x n</em>. As a result, when we double the length of the input - say from 5 to 10 - the size of the attention matrix increases from 25 to 100 entries, leading to a fourfold increase in both memory consumption and the number of computations needed to compute attention scores. This quadratic scaling makes it computationally prohibitive to process long sequences using standard transformers, especially in environments with limited memory or tight latency constraints.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!bENn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7113e8-0687-4fe8-95c9-7bf8575c2e46_591x548.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!bENn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7113e8-0687-4fe8-95c9-7bf8575c2e46_591x548.png 424w, https://substackcdn.com/image/fetch/$s_!bENn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7113e8-0687-4fe8-95c9-7bf8575c2e46_591x548.png 848w, https://substackcdn.com/image/fetch/$s_!bENn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7113e8-0687-4fe8-95c9-7bf8575c2e46_591x548.png 1272w, 
https://substackcdn.com/image/fetch/$s_!bENn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7113e8-0687-4fe8-95c9-7bf8575c2e46_591x548.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!bENn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7113e8-0687-4fe8-95c9-7bf8575c2e46_591x548.png" width="591" height="548" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6b7113e8-0687-4fe8-95c9-7bf8575c2e46_591x548.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:548,&quot;width&quot;:591,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:60255,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/162979886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7113e8-0687-4fe8-95c9-7bf8575c2e46_591x548.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!bENn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7113e8-0687-4fe8-95c9-7bf8575c2e46_591x548.png 424w, https://substackcdn.com/image/fetch/$s_!bENn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7113e8-0687-4fe8-95c9-7bf8575c2e46_591x548.png 848w, https://substackcdn.com/image/fetch/$s_!bENn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7113e8-0687-4fe8-95c9-7bf8575c2e46_591x548.png 
1272w, https://substackcdn.com/image/fetch/$s_!bENn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6b7113e8-0687-4fe8-95c9-7bf8575c2e46_591x548.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Scaling quadratically with sequence length</figcaption></figure></div><h4>Attention Mask.</h4><p>Before we proceed with analysing the techniques designed to address the memory bottleneck of the context window in transformer models, it's important to first introduce the concept of the causal attention mask, as it plays a critical role in 
many of those solutions.</p><p>A causal attention mask is a type of filtering mechanism applied to the attention matrix during computation. Its purpose is to ensure that each token in the input sequence can only attend to itself and to tokens that come before it - never to future tokens. This is typically implemented by applying a lower triangular matrix (as shown in the image below), where positions representing attention to future tokens are masked out, effectively assigning them zero weight.</p><p>Most LLMs we use today - including models in the GPT family - rely on this form of masked attention. It ensures that during both training and inference, the model predicts the next token based only on the preceding context.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Gyon!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8870f9c-5839-413c-b672-062427182f0c_591x521.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Gyon!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8870f9c-5839-413c-b672-062427182f0c_591x521.png 424w, https://substackcdn.com/image/fetch/$s_!Gyon!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8870f9c-5839-413c-b672-062427182f0c_591x521.png 848w, https://substackcdn.com/image/fetch/$s_!Gyon!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8870f9c-5839-413c-b672-062427182f0c_591x521.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Gyon!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8870f9c-5839-413c-b672-062427182f0c_591x521.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Gyon!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8870f9c-5839-413c-b672-062427182f0c_591x521.png" width="591" height="521" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e8870f9c-5839-413c-b672-062427182f0c_591x521.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:521,&quot;width&quot;:591,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:40721,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/162979886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8870f9c-5839-413c-b672-062427182f0c_591x521.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Gyon!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8870f9c-5839-413c-b672-062427182f0c_591x521.png 424w, https://substackcdn.com/image/fetch/$s_!Gyon!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8870f9c-5839-413c-b672-062427182f0c_591x521.png 848w, https://substackcdn.com/image/fetch/$s_!Gyon!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8870f9c-5839-413c-b672-062427182f0c_591x521.png 
1272w, https://substackcdn.com/image/fetch/$s_!Gyon!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe8870f9c-5839-413c-b672-062427182f0c_591x521.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Causal Attention Mask</figcaption></figure></div><p>Now let&#8217;s explore one of the most widely accepted ways of mitigating the memory bottleneck of the context window - FlashAttention.</p><p></p><p class="button-wrapper" 
data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p></p><h3>Enter FlashAttention.</h3><p>Several iterations of FlashAttention have been released to date - <a href="https://arxiv.org/abs/2205.14135">v1</a>, <a href="https://arxiv.org/abs/2307.08691">v2</a> and <a href="https://arxiv.org/abs/2407.08608">v3</a>. However, in this article, instead of focusing on specific implementations we will focus on the core techniques and architectural ideas of FlashAttention, which aim to mitigate the memory and compute bottlenecks in standard self-attention:</p><ul><li><p>I/O Awareness &amp; Fused Kernels.</p></li><li><p>Tiling.</p></li><li><p>Dense Packing.</p></li><li><p>Skipping Blocks.</p></li></ul><p>Together, these techniques significantly reduce the quadratic memory and compute cost of vanilla self-attention, making it feasible to train and deploy large models on longer sequences and with greater efficiency. 
Let&#8217;s explore how.</p><p></p><h4>Fused Kernel.</h4><p>A Kernel in accelerated compute programming is a function or operation you launch on the node to perform a specific task in parallel.</p><p>Modern accelerated compute nodes have a hierarchy of memory:</p><ul><li><p>The main memory (DRAM) which is large but relatively slow to access.</p></li><li><p>High Bandwidth Memory (HBM).</p></li><li><p>Much smaller on-chip memory (like registers, L1 cache, shared memory SRAM) which is very fast.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Jp7s!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69edf29f-abcf-41b1-b0c0-957efe130529_627x512.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Jp7s!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69edf29f-abcf-41b1-b0c0-957efe130529_627x512.png 424w, https://substackcdn.com/image/fetch/$s_!Jp7s!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69edf29f-abcf-41b1-b0c0-957efe130529_627x512.png 848w, https://substackcdn.com/image/fetch/$s_!Jp7s!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69edf29f-abcf-41b1-b0c0-957efe130529_627x512.png 1272w, https://substackcdn.com/image/fetch/$s_!Jp7s!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69edf29f-abcf-41b1-b0c0-957efe130529_627x512.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Jp7s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69edf29f-abcf-41b1-b0c0-957efe130529_627x512.png" width="505" height="412.3763955342903" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/69edf29f-abcf-41b1-b0c0-957efe130529_627x512.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:512,&quot;width&quot;:627,&quot;resizeWidth&quot;:505,&quot;bytes&quot;:88904,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/162979886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69edf29f-abcf-41b1-b0c0-957efe130529_627x512.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Jp7s!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69edf29f-abcf-41b1-b0c0-957efe130529_627x512.png 424w, https://substackcdn.com/image/fetch/$s_!Jp7s!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69edf29f-abcf-41b1-b0c0-957efe130529_627x512.png 848w, https://substackcdn.com/image/fetch/$s_!Jp7s!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69edf29f-abcf-41b1-b0c0-957efe130529_627x512.png 1272w, https://substackcdn.com/image/fetch/$s_!Jp7s!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F69edf29f-abcf-41b1-b0c0-957efe130529_627x512.png 1456w" 
sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Accelerated compute node memory structure. Source: <a href="https://arxiv.org/abs/2205.14135">link</a></figcaption></figure></div><p>Remembering the computations involved in attention heads, we have multiple sequential matrix multiplications. 
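</p><p>As a rough reminder of those sequential steps, here is a minimal single-head sketch in plain Python. It runs on the CPU and does not model GPU memory at all; the helper names matmul, transpose, softmax_rows and unfused_attention and the toy inputs are assumptions made up for illustration. Each step finishes and produces a full intermediate matrix before the next one starts.</p><pre><code>import math


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Swap rows and columns."""
    return [list(col) for col in zip(*m)]


def softmax_rows(m):
    """Apply a numerically stable softmax to every row."""
    out = []
    for row in m:
        peak = max(row)
        exps = [math.exp(value - peak) for value in row]
        total = sum(exps)
        out.append([e / total for e in exps])
    return out


def unfused_attention(q, k, v):
    """Compute attention in three separate passes, keeping every intermediate."""
    d = len(q[0])
    # Step 1: n x n score matrix, a full intermediate result.
    scores = [[s / math.sqrt(d) for s in row] for row in matmul(q, transpose(k))]
    # Step 2: n x n attention weights, a second full intermediate result.
    weights = softmax_rows(scores)
    # Step 3: weighted sum of the value vectors, shape n x d.
    output = matmul(weights, v)
    return output, scores, weights


# Hypothetical toy inputs: 3 tokens, head dimension 2.
q = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
k = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
v = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]

output, scores, weights = unfused_attention(q, k, v)
print(len(scores), len(scores[0]))  # 3 3
print(len(output), len(output[0]))  # 3 2
</code></pre><p>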
With the standard attention implementation without I/O awareness we would load the intermediate results back and forth from HBM to SRAM and vice versa incurring unnecessary reads and writes between different layers of memory.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!q8bO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac384b16-c4ae-436a-b70d-5c6367c29820_961x630.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!q8bO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac384b16-c4ae-436a-b70d-5c6367c29820_961x630.png 424w, https://substackcdn.com/image/fetch/$s_!q8bO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac384b16-c4ae-436a-b70d-5c6367c29820_961x630.png 848w, https://substackcdn.com/image/fetch/$s_!q8bO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac384b16-c4ae-436a-b70d-5c6367c29820_961x630.png 1272w, https://substackcdn.com/image/fetch/$s_!q8bO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac384b16-c4ae-436a-b70d-5c6367c29820_961x630.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!q8bO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac384b16-c4ae-436a-b70d-5c6367c29820_961x630.png" width="961" height="630" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ac384b16-c4ae-436a-b70d-5c6367c29820_961x630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:630,&quot;width&quot;:961,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:96038,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/162979886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac384b16-c4ae-436a-b70d-5c6367c29820_961x630.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!q8bO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac384b16-c4ae-436a-b70d-5c6367c29820_961x630.png 424w, https://substackcdn.com/image/fetch/$s_!q8bO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac384b16-c4ae-436a-b70d-5c6367c29820_961x630.png 848w, https://substackcdn.com/image/fetch/$s_!q8bO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac384b16-c4ae-436a-b70d-5c6367c29820_961x630.png 1272w, https://substackcdn.com/image/fetch/$s_!q8bO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fac384b16-c4ae-436a-b70d-5c6367c29820_961x630.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Regular operations without Fused Kernel</figcaption></figure></div><p>There are two main problems with this:</p><ul><li><p><strong>Memory (IO) bottleneck:</strong> copying these large matrices (especially when sequences are long) to and from slow memory is expensive and quickly becomes the bottleneck.</p></li><li><p><strong>Kernel launch overhead: </strong>each of the attention sub-steps above (matrix multiply, softmax, etc.) would be a separate kernel launch. Launching a GPU kernel has some fixed overhead (like setting up the execution on thousands of threads). If you have to launch multiple kernels one after the other, you pay that overhead every time.</p></li></ul><p>A Fused Kernel simply combines multiple operations into one kernel. 
Instead of launching one kernel for the matrix multiplication, another for softmax, and another for the value multiplication, all these steps are <em>fused</em> into a single kernel.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!HQSh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb7d4065-fdb6-48ed-bbdf-2139e4d41d0b_961x630.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!HQSh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb7d4065-fdb6-48ed-bbdf-2139e4d41d0b_961x630.png 424w, https://substackcdn.com/image/fetch/$s_!HQSh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb7d4065-fdb6-48ed-bbdf-2139e4d41d0b_961x630.png 848w, https://substackcdn.com/image/fetch/$s_!HQSh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb7d4065-fdb6-48ed-bbdf-2139e4d41d0b_961x630.png 1272w, https://substackcdn.com/image/fetch/$s_!HQSh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb7d4065-fdb6-48ed-bbdf-2139e4d41d0b_961x630.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!HQSh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb7d4065-fdb6-48ed-bbdf-2139e4d41d0b_961x630.png" width="961" height="630" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/db7d4065-fdb6-48ed-bbdf-2139e4d41d0b_961x630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:630,&quot;width&quot;:961,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:92534,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/162979886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb7d4065-fdb6-48ed-bbdf-2139e4d41d0b_961x630.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!HQSh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb7d4065-fdb6-48ed-bbdf-2139e4d41d0b_961x630.png 424w, https://substackcdn.com/image/fetch/$s_!HQSh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb7d4065-fdb6-48ed-bbdf-2139e4d41d0b_961x630.png 848w, https://substackcdn.com/image/fetch/$s_!HQSh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb7d4065-fdb6-48ed-bbdf-2139e4d41d0b_961x630.png 1272w, https://substackcdn.com/image/fetch/$s_!HQSh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdb7d4065-fdb6-48ed-bbdf-2139e4d41d0b_961x630.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Fused Kernel</figcaption></figure></div><p>As the graph below shows, the Fused Kernel, together with the other improvements in FlashAttention, reduces the runtime significantly.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CFuh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f0bc450-30ca-4e65-b752-b0edb816a511_662x693.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CFuh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f0bc450-30ca-4e65-b752-b0edb816a511_662x693.png 424w, 
https://substackcdn.com/image/fetch/$s_!CFuh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f0bc450-30ca-4e65-b752-b0edb816a511_662x693.png 848w, https://substackcdn.com/image/fetch/$s_!CFuh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f0bc450-30ca-4e65-b752-b0edb816a511_662x693.png 1272w, https://substackcdn.com/image/fetch/$s_!CFuh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f0bc450-30ca-4e65-b752-b0edb816a511_662x693.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CFuh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f0bc450-30ca-4e65-b752-b0edb816a511_662x693.png" width="492" height="515.0392749244713" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5f0bc450-30ca-4e65-b752-b0edb816a511_662x693.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:693,&quot;width&quot;:662,&quot;resizeWidth&quot;:492,&quot;bytes&quot;:67141,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/162979886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f0bc450-30ca-4e65-b752-b0edb816a511_662x693.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!CFuh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f0bc450-30ca-4e65-b752-b0edb816a511_662x693.png 424w, https://substackcdn.com/image/fetch/$s_!CFuh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f0bc450-30ca-4e65-b752-b0edb816a511_662x693.png 848w, https://substackcdn.com/image/fetch/$s_!CFuh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f0bc450-30ca-4e65-b752-b0edb816a511_662x693.png 1272w, https://substackcdn.com/image/fetch/$s_!CFuh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5f0bc450-30ca-4e65-b752-b0edb816a511_662x693.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Efficiency gains from FlashAttention in general.</figcaption></figure></div><p></p><h4>Tiling.</h4><p>Tiling in FlashAttention is an optimisation strategy that partitions the large attention matrices into smaller, more manageable sub-blocks (tiles) that are processed one at a time. This approach makes efficient use of the memory hierarchy within accelerated compute nodes. 
By loading each block into fast, local memory and reusing it for multiple operations, tiling minimises the need to access slower, large-scale memory repeatedly.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yzmg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11ce8f0b-80ff-4022-b085-8b3cef1dd61a_627x553.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yzmg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11ce8f0b-80ff-4022-b085-8b3cef1dd61a_627x553.png 424w, https://substackcdn.com/image/fetch/$s_!yzmg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11ce8f0b-80ff-4022-b085-8b3cef1dd61a_627x553.png 848w, https://substackcdn.com/image/fetch/$s_!yzmg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11ce8f0b-80ff-4022-b085-8b3cef1dd61a_627x553.png 1272w, https://substackcdn.com/image/fetch/$s_!yzmg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11ce8f0b-80ff-4022-b085-8b3cef1dd61a_627x553.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yzmg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11ce8f0b-80ff-4022-b085-8b3cef1dd61a_627x553.png" width="627" height="553" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/11ce8f0b-80ff-4022-b085-8b3cef1dd61a_627x553.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:553,&quot;width&quot;:627,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:50008,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/162979886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11ce8f0b-80ff-4022-b085-8b3cef1dd61a_627x553.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yzmg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11ce8f0b-80ff-4022-b085-8b3cef1dd61a_627x553.png 424w, https://substackcdn.com/image/fetch/$s_!yzmg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11ce8f0b-80ff-4022-b085-8b3cef1dd61a_627x553.png 848w, https://substackcdn.com/image/fetch/$s_!yzmg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11ce8f0b-80ff-4022-b085-8b3cef1dd61a_627x553.png 1272w, https://substackcdn.com/image/fetch/$s_!yzmg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11ce8f0b-80ff-4022-b085-8b3cef1dd61a_627x553.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Tiling.</figcaption></figure></div><p>As the sequence length increases, tiling becomes increasingly critical because it restructures the attention computation in a way that significantly reduces its memory overhead and the traffic to slow memory. Traditionally, attention scales quadratically with sequence length, making it prohibitively expensive for long sequences. 
However, by breaking the attention matrix into smaller tiles and never materialising the full matrix in slow memory, tiling brings the memory footprint down from quadratic to linear in sequence length. The number of arithmetic operations remains quadratic, but far fewer reads and writes to slow memory are needed.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!iE50!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F830869bb-511f-4c17-aeaa-68bba4dd6d7b_1056x925.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!iE50!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F830869bb-511f-4c17-aeaa-68bba4dd6d7b_1056x925.png 424w, https://substackcdn.com/image/fetch/$s_!iE50!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F830869bb-511f-4c17-aeaa-68bba4dd6d7b_1056x925.png 848w, https://substackcdn.com/image/fetch/$s_!iE50!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F830869bb-511f-4c17-aeaa-68bba4dd6d7b_1056x925.png 1272w, https://substackcdn.com/image/fetch/$s_!iE50!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F830869bb-511f-4c17-aeaa-68bba4dd6d7b_1056x925.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!iE50!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F830869bb-511f-4c17-aeaa-68bba4dd6d7b_1056x925.png" width="1056" height="925" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/830869bb-511f-4c17-aeaa-68bba4dd6d7b_1056x925.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:925,&quot;width&quot;:1056,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:128622,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/162979886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F830869bb-511f-4c17-aeaa-68bba4dd6d7b_1056x925.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!iE50!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F830869bb-511f-4c17-aeaa-68bba4dd6d7b_1056x925.png 424w, https://substackcdn.com/image/fetch/$s_!iE50!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F830869bb-511f-4c17-aeaa-68bba4dd6d7b_1056x925.png 848w, https://substackcdn.com/image/fetch/$s_!iE50!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F830869bb-511f-4c17-aeaa-68bba4dd6d7b_1056x925.png 1272w, https://substackcdn.com/image/fetch/$s_!iE50!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F830869bb-511f-4c17-aeaa-68bba4dd6d7b_1056x925.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Tiling: longer sequence.</figcaption></figure></div><p></p><h4>Skipping Blocks.</h4><p>Remember the causal attention masks? This is where they become important for addressing the memory bottlenecks of the context window. In causal attention, every entry of the attention matrix that would let a token attend to a later token is masked (the upper triangle when rows are queries and columns are keys), so the corresponding attention weights are zero. These masked values have no impact on the final output, making their computation unnecessary. </p><p>When applying tiling, this property becomes especially beneficial: if an entire tile or block of the attention matrix falls within the masked region, it can be completely skipped during computation. 
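</p><p>The tiling and block-skipping ideas above can be sketched in a few lines of NumPy. This is only an illustration of the control flow, not the FlashAttention kernel itself, which runs as a fused GPU kernel working in SRAM. The function names <code>naive_causal_attention</code> and <code>tiled_causal_attention</code> and the <code>block</code> parameter are made up for this example, and the layout assumes rows are queries and columns are keys. The sketch keeps a running softmax (row maximum and row sum) so that tiles can be processed one at a time without ever building the full attention matrix.</p>

```python
import numpy as np


def naive_causal_attention(q, k, v):
    """Reference version: materialises the full (n, n) score matrix."""
    n, d = q.shape
    scores = (q @ k.T) / np.sqrt(d)
    # True where the key position lies after the query position (future token).
    future = np.triu(np.ones((n, n), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v


def tiled_causal_attention(q, k, v, block=4):
    """Tile-by-tile attention; returns (output, tiles_computed, tiles_skipped)."""
    n, d = q.shape
    out = np.zeros_like(v, dtype=np.float64)
    computed = skipped = 0
    for qs in range(0, n, block):
        qe = min(qs + block, n)
        q_blk = q[qs:qe]
        row_max = np.full(qe - qs, -np.inf)     # running max per query row
        row_sum = np.zeros(qe - qs)             # running softmax denominator
        acc = np.zeros((qe - qs, v.shape[1]))   # running unnormalised output
        for ks in range(0, n, block):
            ke = min(ks + block, n)
            if ks > qe - 1:
                # Every key in this tile is in the future of every query: skip it.
                skipped += 1
                continue
            computed += 1
            s = (q_blk @ k[ks:ke].T) / np.sqrt(d)
            q_idx = np.arange(qs, qe)[:, None]
            k_idx = np.arange(ks, ke)[None, :]
            s = np.where(k_idx > q_idx, -np.inf, s)  # mask inside diagonal tiles
            new_max = np.maximum(row_max, s.max(axis=1))
            scale = np.exp(row_max - new_max)        # rescale earlier partial sums
            p = np.exp(s - new_max[:, None])
            row_sum = row_sum * scale + p.sum(axis=1)
            acc = acc * scale[:, None] + p @ v[ks:ke]
            row_max = new_max
        out[qs:qe] = acc / row_sum[:, None]
    return out, computed, skipped


rng = np.random.default_rng(0)
q, k, v = rng.standard_normal((3, 16, 8))
out, computed, skipped = tiled_causal_attention(q, k, v, block=4)
print(np.allclose(out, naive_causal_attention(q, k, v)), computed, skipped)
```

<p>With 16 tokens and a tile size of 4, the attention matrix splits into 16 tiles, of which 10 are computed and 6 are skipped, while the output still matches the reference implementation.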
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ShYn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8222e798-22df-4a39-b441-078d7bbb32cc_1056x925.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ShYn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8222e798-22df-4a39-b441-078d7bbb32cc_1056x925.png 424w, https://substackcdn.com/image/fetch/$s_!ShYn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8222e798-22df-4a39-b441-078d7bbb32cc_1056x925.png 848w, https://substackcdn.com/image/fetch/$s_!ShYn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8222e798-22df-4a39-b441-078d7bbb32cc_1056x925.png 1272w, https://substackcdn.com/image/fetch/$s_!ShYn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8222e798-22df-4a39-b441-078d7bbb32cc_1056x925.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ShYn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8222e798-22df-4a39-b441-078d7bbb32cc_1056x925.png" width="1056" height="925" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8222e798-22df-4a39-b441-078d7bbb32cc_1056x925.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:925,&quot;width&quot;:1056,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:183438,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/162979886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8222e798-22df-4a39-b441-078d7bbb32cc_1056x925.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ShYn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8222e798-22df-4a39-b441-078d7bbb32cc_1056x925.png 424w, https://substackcdn.com/image/fetch/$s_!ShYn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8222e798-22df-4a39-b441-078d7bbb32cc_1056x925.png 848w, https://substackcdn.com/image/fetch/$s_!ShYn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8222e798-22df-4a39-b441-078d7bbb32cc_1056x925.png 1272w, https://substackcdn.com/image/fetch/$s_!ShYn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8222e798-22df-4a39-b441-078d7bbb32cc_1056x925.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Skipping blocks with tiling.</figcaption></figure></div><p></p><h4>Dense Packing.</h4><p>In traditional implementations, attention computation often involves padding sequences to a uniform length, which leads to wasted computation on the padding tokens.</p><p>Imagine two sequences:</p><ul><li><p>&#8220;He does not know what is best for him and his friends.&#8221;</p></li><li><p>&#8220;I am the second sample nice to meet you.&#8221;</p></li></ul><p>This is how the attention matrix would look for the first sequence if we fix the sequence length at 20. 
Notice all of the padding tokens required.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Bvd9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11e010bf-3111-4685-852d-2f9441c0b9ef_1131x998.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Bvd9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11e010bf-3111-4685-852d-2f9441c0b9ef_1131x998.png 424w, https://substackcdn.com/image/fetch/$s_!Bvd9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11e010bf-3111-4685-852d-2f9441c0b9ef_1131x998.png 848w, https://substackcdn.com/image/fetch/$s_!Bvd9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11e010bf-3111-4685-852d-2f9441c0b9ef_1131x998.png 1272w, https://substackcdn.com/image/fetch/$s_!Bvd9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11e010bf-3111-4685-852d-2f9441c0b9ef_1131x998.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Bvd9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11e010bf-3111-4685-852d-2f9441c0b9ef_1131x998.png" width="1131" height="998" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/11e010bf-3111-4685-852d-2f9441c0b9ef_1131x998.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:998,&quot;width&quot;:1131,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:138056,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/162979886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11e010bf-3111-4685-852d-2f9441c0b9ef_1131x998.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Bvd9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11e010bf-3111-4685-852d-2f9441c0b9ef_1131x998.png 424w, https://substackcdn.com/image/fetch/$s_!Bvd9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11e010bf-3111-4685-852d-2f9441c0b9ef_1131x998.png 848w, https://substackcdn.com/image/fetch/$s_!Bvd9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11e010bf-3111-4685-852d-2f9441c0b9ef_1131x998.png 1272w, https://substackcdn.com/image/fetch/$s_!Bvd9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F11e010bf-3111-4685-852d-2f9441c0b9ef_1131x998.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Padding: sequence 1.</figcaption></figure></div><p>This is how the second sequence would be represented:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sY9n!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F744ad45b-fa52-4d3b-93d7-1bd492f415a8_1131x998.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sY9n!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F744ad45b-fa52-4d3b-93d7-1bd492f415a8_1131x998.png 424w, 
https://substackcdn.com/image/fetch/$s_!sY9n!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F744ad45b-fa52-4d3b-93d7-1bd492f415a8_1131x998.png 848w, https://substackcdn.com/image/fetch/$s_!sY9n!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F744ad45b-fa52-4d3b-93d7-1bd492f415a8_1131x998.png 1272w, https://substackcdn.com/image/fetch/$s_!sY9n!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F744ad45b-fa52-4d3b-93d7-1bd492f415a8_1131x998.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!sY9n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F744ad45b-fa52-4d3b-93d7-1bd492f415a8_1131x998.png" width="1131" height="998" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/744ad45b-fa52-4d3b-93d7-1bd492f415a8_1131x998.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:998,&quot;width&quot;:1131,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:133125,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/162979886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F744ad45b-fa52-4d3b-93d7-1bd492f415a8_1131x998.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!sY9n!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F744ad45b-fa52-4d3b-93d7-1bd492f415a8_1131x998.png 424w, https://substackcdn.com/image/fetch/$s_!sY9n!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F744ad45b-fa52-4d3b-93d7-1bd492f415a8_1131x998.png 848w, https://substackcdn.com/image/fetch/$s_!sY9n!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F744ad45b-fa52-4d3b-93d7-1bd492f415a8_1131x998.png 1272w, https://substackcdn.com/image/fetch/$s_!sY9n!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F744ad45b-fa52-4d3b-93d7-1bd492f415a8_1131x998.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Padding: sequence 2.</figcaption></figure></div><p>Here is where dense packing comes in. Instead of padding every sequence to the fixed length, the technique packs multiple variable-length sequences into the same fixed-length slot of a batch, so memory blocks are filled as completely as possible, which minimises idle compute and reduces memory overhead. Notice how the masked areas are placed: they keep the tokens of one sequence from attending to the tokens of the other.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7P1t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31bc028-89f7-474d-b773-d60a4b3559b4_1131x998.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7P1t!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31bc028-89f7-474d-b773-d60a4b3559b4_1131x998.png 424w, https://substackcdn.com/image/fetch/$s_!7P1t!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31bc028-89f7-474d-b773-d60a4b3559b4_1131x998.png 848w, https://substackcdn.com/image/fetch/$s_!7P1t!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31bc028-89f7-474d-b773-d60a4b3559b4_1131x998.png 1272w, 
https://substackcdn.com/image/fetch/$s_!7P1t!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31bc028-89f7-474d-b773-d60a4b3559b4_1131x998.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7P1t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31bc028-89f7-474d-b773-d60a4b3559b4_1131x998.png" width="1131" height="998" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f31bc028-89f7-474d-b773-d60a4b3559b4_1131x998.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:998,&quot;width&quot;:1131,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:144087,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/162979886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31bc028-89f7-474d-b773-d60a4b3559b4_1131x998.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7P1t!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31bc028-89f7-474d-b773-d60a4b3559b4_1131x998.png 424w, https://substackcdn.com/image/fetch/$s_!7P1t!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31bc028-89f7-474d-b773-d60a4b3559b4_1131x998.png 848w, 
https://substackcdn.com/image/fetch/$s_!7P1t!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31bc028-89f7-474d-b773-d60a4b3559b4_1131x998.png 1272w, https://substackcdn.com/image/fetch/$s_!7P1t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31bc028-89f7-474d-b773-d60a4b3559b4_1131x998.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Dense Packing.</figcaption></figure></div><p>Combined with Tiling, Block Skipping and Kernel Fusion the algorithm 
achieves maximal efficiency gains.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ry1X!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae5cb372-4b97-45ef-8b99-861afadc2ca1_1131x998.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ry1X!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae5cb372-4b97-45ef-8b99-861afadc2ca1_1131x998.png 424w, https://substackcdn.com/image/fetch/$s_!Ry1X!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae5cb372-4b97-45ef-8b99-861afadc2ca1_1131x998.png 848w, https://substackcdn.com/image/fetch/$s_!Ry1X!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae5cb372-4b97-45ef-8b99-861afadc2ca1_1131x998.png 1272w, https://substackcdn.com/image/fetch/$s_!Ry1X!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae5cb372-4b97-45ef-8b99-861afadc2ca1_1131x998.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ry1X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae5cb372-4b97-45ef-8b99-861afadc2ca1_1131x998.png" width="1131" height="998" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ae5cb372-4b97-45ef-8b99-861afadc2ca1_1131x998.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:998,&quot;width&quot;:1131,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:218428,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/162979886?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae5cb372-4b97-45ef-8b99-861afadc2ca1_1131x998.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ry1X!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae5cb372-4b97-45ef-8b99-861afadc2ca1_1131x998.png 424w, https://substackcdn.com/image/fetch/$s_!Ry1X!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae5cb372-4b97-45ef-8b99-861afadc2ca1_1131x998.png 848w, https://substackcdn.com/image/fetch/$s_!Ry1X!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae5cb372-4b97-45ef-8b99-861afadc2ca1_1131x998.png 1272w, https://substackcdn.com/image/fetch/$s_!Ry1X!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae5cb372-4b97-45ef-8b99-861afadc2ca1_1131x998.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Dense Packing and Block Skipping.</figcaption></figure></div><p>In practical terms, FlashAttention can handle much longer sequences on a single accelerated compute node before running out of memory, and it can be up to several times faster than the naive attention implementation. However, FlashAttention by itself does not parallelise across multiple nodes - it&#8217;s about optimising attention on one device.</p><p></p><h3>Parallelism in Distributed LLM Training.</h3><p>To use FlashAttention for extremely long contexts that exceed a single node&#8217;s capability, or to speed up training by using multiple nodes, we need to parallelise the FlashAttention computation across devices. 
</p><p>Training LLMs often requires splitting work across multiple accelerated compute nodes due to the enormous model sizes and sequence lengths. Several techniques are used to achieve this:</p><ul><li><p><strong>Data parallelism:</strong> the simplest approach - each node gets a different subset of the training data (mini-batch), and a full copy of the model. After each step, gradients are synchronised to keep model replicas in sync. However, data parallelism doesn&#8217;t reduce the per-node memory or compute load of the model itself; it just processes more data in parallel.</p></li></ul><p>To distribute the model&#8217;s workload, model parallelism is used. There are generally two forms of model parallelism:</p><ul><li><p><strong>Tensor (intra-layer) parallelism:</strong> splitting the computations <em>within</em> a layer across devices. For example, splitting a large matrix multiplication or the set of attention heads across nodes. In attention, this could mean each node computes a subset of the attention heads independently, then their outputs are combined. This reduces per-node computation, but it has limitations &#8211; e.g. the number of heads might be smaller than the number of available nodes.</p></li><li><p><strong>Pipeline (inter-layer) parallelism:</strong> splitting the model&#8217;s layers among different nodes. Each node holds a consecutive chunk of the network layers. A micro-batch of data flows through the pipeline: the first node processes the first few layers, then passes activations to the next node for the next layers, and so on.</p></li></ul><p>In practice, state-of-the-art training combines all types of parallelism. For example, one might use tensor parallelism within each node, pipeline parallelism across nodes, and data parallelism for even higher scale. This layout is usually chosen because the parallelism types differ in how much network bandwidth they need: tensor parallelism communicates the most, so it is kept on the fast links within a node. 
</p><p>Another strategy relevant for long sequences is sequence parallelism (also called context parallelism). This involves splitting the input sequence along its length across nodes, so that each node handles a different chunk of the sequence tokens. The challenge is that certain operations (like self-attention) are not independent across sequence chunks and require communication.</p><p></p><h3>Kvax by Nebius.</h3><p>Kvax makes FlashAttention <em>&#8220;distributed-friendly&#8221;</em> through several key techniques:</p><ul><li><p><strong>Context (Sequence) Parallelism with All-Gather:</strong> It splits the sequence across nodes and uses an all-gather of Key/Value tensors so each node obtains the full context before computing attention. This allows each node to run the FlashAttention algorithm on its local queries with all necessary data, just as if it were a single-device run.</p></li><li><p><strong>Grouped Query Attention (GQA):</strong> By using GQA (grouping heads to share K/V), the size of keys and values to communicate is reduced. This keeps the all-gather overhead very low relative to computation, making scaling efficient. The attention computation remains the dominant cost, so multiple nodes can collaborate with minimal slowdown.</p></li><li><p><strong>Fused high-performance kernels:</strong> Kvax&#8217;s implementation in Triton ensures that on each node the FlashAttention computation (and mask application) is done in a tiled, memory-efficient manner, just like single-node FlashAttention. 
It also splits and optimises the backward pass, using all-reduce to accumulate gradients, so multi-node training keeps the same mathematical correctness with minimal overhead.</p></li><li><p><strong>Support for complex masks:</strong> The approach naturally accommodates causal masks and packed-sequence (document) masks by computing masks on the fly and by virtue of each node having all keys/values (making mask logic simpler). This is crucial for real training scenarios where you concatenate multiple sequences or need attention window constraints.</p></li><li><p><strong>Load balancing across sequence chunks:</strong> Kvax employs token shuffling across chunks (per LLaMA3&#8217;s recipe) to ensure each node does a similar amount of work. This prevents any single shard from becoming a hotspot (for example, the node handling the last part of a long sequence isn&#8217;t overwhelmed by having to attend to all preceding tokens).</p></li><li><p><strong>Combined parallelism:</strong> Kvax doesn&#8217;t force you to choose one type of parallelism over another - it supports using data parallelism simultaneously with model parallelism. For instance, you could use tensor parallelism to split heads across GPUs and use context parallelism to split a very long sequence, in conjunction with data parallelism across multiple such groups. This flexibility means Kvax can fit into many multi-dimensional parallel setups for large-scale training.</p></li></ul><p>In essence, Kvax enables large-scale training of transformers with long contexts by efficiently distributing the attention computation.</p><p>Kvax demonstrates that FlashAttention can indeed be scaled across multiple devices without losing its efficiency edge. 
By addressing communication overhead and preserving memory-optimal computation, it allows training of LLMs with extremely long sequences (potentially 10k-100k tokens) on clusters of nodes while maintaining high throughput.</p><p>Don&#8217;t forget to check out the open source library:</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://github.com/nebius/kvax&quot;,&quot;text&quot;:&quot;Check out Kvax&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://github.com/nebius/kvax"><span>Check out Kvax</span></a></p><p></p><h3><strong>Wrapping up.</strong></h3><p>I intentionally kept this article high level enough to make it easy to skim through and build an understanding of the difficulties of training large context window LLMs. Here are some general conclusions that I came to while writing this piece (some of the supporting facts are not part of the newsletter):</p><ul><li><p>We have come quite far in solving the Memory Bottleneck of the Context Window in the few years since the emergence of LLMs.</p></li><li><p>FlashAttention is the standard technique to solve some of the Bottleneck problems and is included as a default in most of the training libraries.</p></li><li><p>Distributed training with FlashAttention is not trivial. While it is being solved internally in big AI Labs, companies like Nebius are doing the community a big favour by open sourcing their implementations that help with distributed training.</p></li><li><p>There is a lot of research under way that targets the ability of LLMs to generalise to longer sequences even when trained on shorter sequences. 
This, in combination with Distributed FlashAttention, might allow truly long context windows that actually work.</p></li><li><p>The problems of &#8220;Lost in the middle&#8221; and &#8220;Needle in the haystack&#8221; have not been solved yet, but it seems they could be in the future.</p></li></ul><p>Hope you enjoyed the writeup and hope to see you in the next one!</p>]]></content:encoded></item><item><title><![CDATA[MCP vs. A2A: Friends or Foes?]]></title><description><![CDATA[Could MCP fade into irrelevance in the long term?]]></description><link>https://www.newsletter.swirlai.com/p/mcp-vs-a2a-friends-or-foes</link><guid isPermaLink="false">https://www.newsletter.swirlai.com/p/mcp-vs-a2a-friends-or-foes</guid><dc:creator><![CDATA[Aurimas Griciūnas]]></dc:creator><pubDate>Sun, 13 Apr 2025 09:36:31 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/65e21ee4-ed01-4ff3-967c-e1a57ccbdd41_3037x2606.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#128075; <em>I am <a href="https://www.linkedin.com/in/aurimas-griciunas">Aurimas</a>. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">SwirlAI Newsletter is a reader-supported publication. 
To receive new posts and support my work, consider becoming a subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><p>The success of MCP (Model Context Protocol) by Anthropic has clearly inspired other players in the AI industry to join the race to define the open protocols that will be used in Agentic Systems integration.</p><p>This week Google came out publicly with their open protocol called A2A (Agent2Agent), which aims to standardise how communication in multi-Agent systems is implemented. Many are saying (misinterpreting?) that the two protocols are competitive rather than complementary.</p><p>The public stance by Google is that A2A is complementary to MCP. That is a reasonable statement. However, could there be hidden long-term competitive goals? Will we see protocol wars start soon? </p><p>I have been asked multiple times how I believe the two protocols might become competitive in the future. Read until the end to get my thoughts on it.</p><p>In this Newsletter episode:</p><ul><li><p>What is A2A?</p></li><li><p>What is MCP?</p></li><li><p>How is A2A complementary to MCP and vice versa?</p></li><li><p>Could A2A eat up MCP long term?</p></li></ul><p></p><h3>A2A explained.</h3><p>Let&#8217;s first define A2A as it is the new kid on the block (MCP already had its spotlight).</p><h4>The problem.</h4><p>It is becoming clear that Agentic Systems of the future will be multi-Agent. Even more, Agents will be collaborating with each other remotely, each of them potentially implemented using different Agent Frameworks (e.g. 
LangGraph, CrewAI, Agent Development Kit etc.).</p><p>There are a few inherent problems with this:</p><ul><li><p>System State transfer and exchange between systems implemented in different frameworks is not supported.</p></li><li><p>Transfer of the System State between remote Agents is also not possible.</p></li><li><p>Disconnected Agents do not share Tools, Context and Memory (including System State).</p></li></ul><p></p><h4>The solution.</h4><blockquote><p>A2A is an open protocol that provides a standard way for agents to collaborate with each other, regardless of the underlying framework or vendor.</p></blockquote><p></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IHXs!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d78e7ba-affe-4ac5-9e67-76c29d2ccee4_1620x1303.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IHXs!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d78e7ba-affe-4ac5-9e67-76c29d2ccee4_1620x1303.png 424w, https://substackcdn.com/image/fetch/$s_!IHXs!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d78e7ba-affe-4ac5-9e67-76c29d2ccee4_1620x1303.png 848w, https://substackcdn.com/image/fetch/$s_!IHXs!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d78e7ba-affe-4ac5-9e67-76c29d2ccee4_1620x1303.png 1272w, https://substackcdn.com/image/fetch/$s_!IHXs!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d78e7ba-affe-4ac5-9e67-76c29d2ccee4_1620x1303.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!IHXs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d78e7ba-affe-4ac5-9e67-76c29d2ccee4_1620x1303.png" width="588" height="472.90384615384613" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5d78e7ba-affe-4ac5-9e67-76c29d2ccee4_1620x1303.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c3af25aa-c748-4db6-880b-e000004b2f50_1620x1303.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1171,&quot;width&quot;:1456,&quot;resizeWidth&quot;:588,&quot;bytes&quot;:132105,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/161199380?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc3af25aa-c748-4db6-880b-e000004b2f50_1620x1303.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!IHXs!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d78e7ba-affe-4ac5-9e67-76c29d2ccee4_1620x1303.png 424w, https://substackcdn.com/image/fetch/$s_!IHXs!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d78e7ba-affe-4ac5-9e67-76c29d2ccee4_1620x1303.png 848w, https://substackcdn.com/image/fetch/$s_!IHXs!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d78e7ba-affe-4ac5-9e67-76c29d2ccee4_1620x1303.png 1272w, 
https://substackcdn.com/image/fetch/$s_!IHXs!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d78e7ba-affe-4ac5-9e67-76c29d2ccee4_1620x1303.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">A2A Protocol</figcaption></figure></div><p>As per the official Google documentation:</p><blockquote><p>A2A protocol facilitates communication between &#8220;client&#8221; and &#8220;remote&#8221; agent.</p></blockquote><p>In simple terms, a &#8220;client&#8221; agent creates tasks and communicates them to the &#8220;remote&#8221; agent, expecting some 
work to be performed or data returned.</p><p>Main A2A capabilities facilitating this:</p><ul><li><p>Capability discovery - all agents implementing A2A expose their capability catalogue via an &#8220;Agent Card&#8221;. It helps other Agents discover potentially useful features implemented by the given Agent.</p></li><li><p>Task management - a communication protocol that facilitates short and long-running tasks. It helps communicating Agents stay in sync until the requested task is completed and the answer returned. This is big because some Agents might take a long time to execute their work and there are no standards on how to wait for this to happen.</p></li><li><p>Collaboration - Agents can send each other messages to communicate context, replies, artifacts, or user instructions.</p></li><li><p>User experience negotiation - this one is pretty interesting. It allows Agents to negotiate the format in which data should be returned so that it fits the user&#8217;s UI expectations (e.g. image, video, text etc.). </p></li></ul><p>Discovery of agents exposed via A2A is a big topic. Google suggests a unified place to store an organisation&#8217;s &#8220;Agent Cards&#8221;. E.g.:</p><pre><code>https://&lt;DOMAIN&gt;/&lt;agreed-path&gt;/agent.json</code></pre><p>This is not unexpected, as Google would then be in the best spot to index all of the available Agents worldwide, potentially creating a global Agent Catalogue similar to the current search index.</p><p>There has been a lot of talk about headless browsers and how the future internet of Agents will be implemented. 
The above is one of the ways it could happen by leveraging already existing technologies.</p><p>I love how A2A stresses the need to not reinvent the wheel and builds on existing standards:</p><ul><li><p>The protocol is built on top of existing, popular standards including HTTP, SSE, JSON-RPC, which means it&#8217;s easier to integrate with existing IT stacks businesses already use daily.</p></li><li><p>Secure by default - A2A is designed to support enterprise-grade authentication and authorization, with parity to OpenAPI&#8217;s authentication schemes.</p><p></p></li></ul><p></p><h3>MCP Explained.</h3><p>MCP (Model Context Protocol) as defined by Anthropic is:</p><blockquote><p>An open protocol that standardises how applications provide context to LLMs.</p></blockquote><p>To be more precise, it attempts to standardise how LLM-based applications integrate with other environments.</p><p>In Agentic systems the context can be provided in multiple ways:</p><ul><li><p>External data - this is part of long-term memory.</p></li><li><p>Tools - the capability of the system to interact with the environment.</p></li><li><p>Dynamic Prompts - that can be injected as part of the system prompt.</p></li><li><p>&#8230;</p></li></ul><h4>Why the need to standardise?</h4><p>The current development flow of Agentic applications is chaotic:</p><ul><li><p>There are many Agent frameworks with slight differences. 
While it is encouraging to see the ecosystem flourish, these slight differences rarely add enough value, yet they can significantly change the way you write code.</p></li><li><p>Integrations with external data sources are usually implemented ad hoc and use different protocols even within a single organisation. The same is clearly true across different companies.</p></li><li><p>Tools are defined in code repositories in slightly different ways. How you attach tools to augmented LLMs is different as well.</p></li></ul><p></p><p>The goal of standardisation is to improve how fast we can innovate with Agentic applications, how well we can secure them and how easy it is to bring relevant data into the context.</p><p>Below is the high-level architecture of MCP.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!528O!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa77772f-3a47-417a-9780-c6942544f7db_1219x684.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!528O!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa77772f-3a47-417a-9780-c6942544f7db_1219x684.png 424w, https://substackcdn.com/image/fetch/$s_!528O!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa77772f-3a47-417a-9780-c6942544f7db_1219x684.png 848w, https://substackcdn.com/image/fetch/$s_!528O!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa77772f-3a47-417a-9780-c6942544f7db_1219x684.png 1272w, 
https://substackcdn.com/image/fetch/$s_!528O!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa77772f-3a47-417a-9780-c6942544f7db_1219x684.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!528O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa77772f-3a47-417a-9780-c6942544f7db_1219x684.png" width="1219" height="684" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fa77772f-3a47-417a-9780-c6942544f7db_1219x684.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5aa88a15-92ce-47f6-bf2a-265ac8f970ad_1219x684.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:684,&quot;width&quot;:1219,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:102333,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/159065609?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa88a15-92ce-47f6-bf2a-265ac8f970ad_1219x684.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!528O!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa77772f-3a47-417a-9780-c6942544f7db_1219x684.png 424w, https://substackcdn.com/image/fetch/$s_!528O!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa77772f-3a47-417a-9780-c6942544f7db_1219x684.png 848w, 
https://substackcdn.com/image/fetch/$s_!528O!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa77772f-3a47-417a-9780-c6942544f7db_1219x684.png 1272w, https://substackcdn.com/image/fetch/$s_!528O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa77772f-3a47-417a-9780-c6942544f7db_1219x684.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><ol><li><p>MCP Host - Programs using LLMs at the core that want to access data through MCP.</p></li><li><p>MCP Client - Clients that 
maintain 1:1 connections with servers.</p></li><li><p>MCP Server - Lightweight programs that each expose specific capabilities through the standardised Model Context Protocol.</p></li><li><p>Local Data Sources - Your computer&#8217;s files, databases, and services that MCP servers can securely access.</p></li><li><p>Remote Data Sources - External systems available over the internet (e.g., through APIs) that MCP servers can connect to.</p><p></p></li></ol><h4>Splitting control responsibilities through MCP.</h4><p>MCP Servers expose three main elements that are purposely built in a way that helps implement specific control segregation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UVok!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UVok!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png 424w, https://substackcdn.com/image/fetch/$s_!UVok!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png 848w, https://substackcdn.com/image/fetch/$s_!UVok!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png 1272w, https://substackcdn.com/image/fetch/$s_!UVok!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!UVok!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png" width="539" height="385.9207772795217" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9771f5e6-063d-46f2-bcff-e19f2a36bcdc_669x479.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:479,&quot;width&quot;:669,&quot;resizeWidth&quot;:539,&quot;bytes&quot;:42132,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/159065609?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9771f5e6-063d-46f2-bcff-e19f2a36bcdc_669x479.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!UVok!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png 424w, https://substackcdn.com/image/fetch/$s_!UVok!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png 848w, https://substackcdn.com/image/fetch/$s_!UVok!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png 1272w, 
https://substackcdn.com/image/fetch/$s_!UVok!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><ul><li><p>Prompts are designed to be User-Controlled.</p><ul><li><p>The programmer of the server can expose specific prompts (suited for interaction with data exposed by the server) that can be injected into the LLM-based application and exposed to the user of that application.</p></li></ul></li><li><p>Resources are designed to be 
Application-Controlled.</p><ul><li><p>Resources are any kind of data (text or binary) that can be used by the application built to utilise LLMs. The programmer of the application (usually an AI Engineer) is responsible for codifying how this information should be used by the application. Usually, there is no automation involved and the LLM does not participate in this choice.</p></li></ul></li><li><p>Tools are designed to be Model-Controlled.</p><ul><li><p>If we give our application agency over how it interacts with the environment, we use tools to do that. The MCP Server exposes an endpoint that lists all of the available tools with their descriptions and required arguments. The application can pass this list to the LLM so that it can decide which tools are needed for the task at hand and how they should be invoked.</p></li></ul></li></ul><p></p><h3>A2A + MCP.</h3><p>As per the official Google stance:</p><blockquote><p>Agentic applications need both A2A and MCP. We recommend MCP for tools and A2A for agents.</p></blockquote><p>What does it mean? 
Let&#8217;s look into an Agentic System architecture that involves multiple Agents.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ldoM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc280dcb8-4359-42d2-bcfb-557faa4884c7_2023x1854.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ldoM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc280dcb8-4359-42d2-bcfb-557faa4884c7_2023x1854.png 424w, https://substackcdn.com/image/fetch/$s_!ldoM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc280dcb8-4359-42d2-bcfb-557faa4884c7_2023x1854.png 848w, https://substackcdn.com/image/fetch/$s_!ldoM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc280dcb8-4359-42d2-bcfb-557faa4884c7_2023x1854.png 1272w, https://substackcdn.com/image/fetch/$s_!ldoM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc280dcb8-4359-42d2-bcfb-557faa4884c7_2023x1854.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ldoM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc280dcb8-4359-42d2-bcfb-557faa4884c7_2023x1854.png" width="632" height="579.0439560439561" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c280dcb8-4359-42d2-bcfb-557faa4884c7_2023x1854.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/78db48ba-6ab9-4410-8aea-88108f86f53c_2023x1854.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1334,&quot;width&quot;:1456,&quot;resizeWidth&quot;:632,&quot;bytes&quot;:367166,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/161199380?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F78db48ba-6ab9-4410-8aea-88108f86f53c_2023x1854.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ldoM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc280dcb8-4359-42d2-bcfb-557faa4884c7_2023x1854.png 424w, https://substackcdn.com/image/fetch/$s_!ldoM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc280dcb8-4359-42d2-bcfb-557faa4884c7_2023x1854.png 848w, https://substackcdn.com/image/fetch/$s_!ldoM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc280dcb8-4359-42d2-bcfb-557faa4884c7_2023x1854.png 1272w, https://substackcdn.com/image/fetch/$s_!ldoM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc280dcb8-4359-42d2-bcfb-557faa4884c7_2023x1854.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">A2A + MCP</figcaption></figure></div><p><em>Moving pieces in MCP:</em></p><ol><li><p>MCP Host - This is where it gets interesting: when combined with A2A, the MCP Host is the Agent.</p></li><li><p>MCP Client.</p></li><li><p>MCP Server.</p></li><li><p>Local Data Sources.</p></li><li><p>Remote Data Sources.</p></li></ol><p><em>A2A:</em></p><ol start="6"><li><p>Agents (MCP Hosts) would implement and communicate via the A2A protocol, which enables:</p><ol><li><p>Secure Collaboration - MCP lacks authentication. 
[<strong>Update:</strong> recently, there have been many improvements to authentication in MCP: <a href="https://github.com/modelcontextprotocol/modelcontextprotocol/pull/284">link</a>].</p></li><li><p>Task and State Management.</p></li><li><p>User Experience Negotiation.</p></li><li><p>Capability discovery - similar to MCP tools.</p></li></ol></li></ol><p>The suggestion is that MCP is used mostly for integrating legacy data systems (MCP Resources) and APIs (MCP Tools) with LLM-based applications, while A2A takes care of inter-Agent communication.</p><p>I do believe that as we move forward it will become more common to expose your platforms as Agents rather than MCP Servers, so the importance of MCP in <em>point 5.</em> will gradually decrease.</p><p></p><h4>Agent discovery via MCP.</h4><p>Google goes as far as suggesting exposing A2A Agents via MCP server resources.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CZ0z!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1dbff19-85c8-4e23-a5e9-0b2da42eb11b_1623x852.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CZ0z!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1dbff19-85c8-4e23-a5e9-0b2da42eb11b_1623x852.png 424w, https://substackcdn.com/image/fetch/$s_!CZ0z!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1dbff19-85c8-4e23-a5e9-0b2da42eb11b_1623x852.png 848w, https://substackcdn.com/image/fetch/$s_!CZ0z!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1dbff19-85c8-4e23-a5e9-0b2da42eb11b_1623x852.png 1272w, 
https://substackcdn.com/image/fetch/$s_!CZ0z!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1dbff19-85c8-4e23-a5e9-0b2da42eb11b_1623x852.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CZ0z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1dbff19-85c8-4e23-a5e9-0b2da42eb11b_1623x852.png" width="1456" height="764" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f1dbff19-85c8-4e23-a5e9-0b2da42eb11b_1623x852.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a9daa0a3-af55-41d0-8cbc-9ef8e387f7e2_1623x852.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:764,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:127852,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/161199380?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa9daa0a3-af55-41d0-8cbc-9ef8e387f7e2_1623x852.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CZ0z!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1dbff19-85c8-4e23-a5e9-0b2da42eb11b_1623x852.png 424w, https://substackcdn.com/image/fetch/$s_!CZ0z!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1dbff19-85c8-4e23-a5e9-0b2da42eb11b_1623x852.png 848w, 
https://substackcdn.com/image/fetch/$s_!CZ0z!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1dbff19-85c8-4e23-a5e9-0b2da42eb11b_1623x852.png 1272w, https://substackcdn.com/image/fetch/$s_!CZ0z!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1dbff19-85c8-4e23-a5e9-0b2da42eb11b_1623x852.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Agents as MCP Resources</figcaption></figure></div><ol><li><p>Each Agent in the mesh would be able to discover other 
available Agents by connecting to a dedicated MCP Server via an MCP Client and browsing the resource catalogue. The suggestion is to expose Agent Cards through these MCP resources.</p></li><li><p>Once discovered, Agents would continue communicating with each other using the A2A protocol.</p></li></ol><p>Having said this, if we move towards Agent discovery via a global index, the importance of MCP here would also decrease or it could even disappear.</p><p></p><h3>Could A2A eat up MCP long term?</h3><p>So is MCP at risk of fading into irrelevance?</p><p>For quite some time there has been a clear need for a medium that connects the swarms of Agents unleashed into the world with each other and with legacy systems. There were talks about headless browsers, but it does seem like these open communication protocols will actually be the way forward. I believe that this is also where the talks about MCP being the new HTTP moment (over-exaggerated?) 
were coming from.</p><p>The following thoughts are based on some assumptions:</p><ul><li><p>Open communication protocols will integrate the Agents of the new world.</p></li><li><p>There is benefit in being the party behind the leading protocol.</p></li><li><p>Both protocols will continue to evolve and potentially expand the boundaries of responsibility.</p><p></p></li></ul><h4>Similarities in MCP and A2A.</h4><p>There are some clear similarities in both protocols, and a user could choose to model their Agentic applications and expose them to the world in multiple ways.</p><p>With MCP skyrocketing in popularity, it is becoming the norm for companies to ship MCP servers as part of their offerings so that developers can seamlessly integrate context from these platforms into their own LLM based applications.</p><p>However, MCP is facing some issues in adoption. </p><ul><li><p>One of the biggest downsides of the protocol is the lack of security and authentication. You do need to tinker around the basic implementation to expose a remote MCP server securely. </p></li><li><p>Tools can describe anything, including other Agents. Unfortunately, MCP does not implement any primitives that facilitate proper communication between Agents via tools (state/context exchange, long-running task support etc).</p></li></ul><p>This is where Google might have found a wedge to enter the protocol wars with A2A by fixing the above.</p><p>I cannot shake off the feeling that Anthropic had bigger plans for MCP than its current state, including the interconnection of multiple Agents. 
Now, the door to expand into this area might be closed due to the emergence of A2A.</p><p></p><h4>What is really important long term?</h4><p>If we think long term, how will the Agentic world really be modelled?</p><ul><li><p>Companies expose their data assets to be used by Agents.</p></li><li><p>Companies expose Agents that can return data or perform actions.</p></li><li><p>Companies ARE Agents that other Agents can interact with.</p></li></ul><p>I am betting on the movement towards the last option.</p><p>If the assumption holds, the real power is in the hands of the protocol that controls remote inter-Agent communication.</p><p>Even in the short term, assuming the second option becomes the default for newly emerging companies, A2A is a clear winner for anyone who chooses to expose their data via an Agent. </p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!ARiD!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8778fa6b-9c22-4613-83f4-3fa38323b421_1757x1707.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!ARiD!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8778fa6b-9c22-4613-83f4-3fa38323b421_1757x1707.png 424w, https://substackcdn.com/image/fetch/$s_!ARiD!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8778fa6b-9c22-4613-83f4-3fa38323b421_1757x1707.png 848w, https://substackcdn.com/image/fetch/$s_!ARiD!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8778fa6b-9c22-4613-83f4-3fa38323b421_1757x1707.png 1272w, 
https://substackcdn.com/image/fetch/$s_!ARiD!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8778fa6b-9c22-4613-83f4-3fa38323b421_1757x1707.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!ARiD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8778fa6b-9c22-4613-83f4-3fa38323b421_1757x1707.png" width="484" height="470.3708791208791" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8778fa6b-9c22-4613-83f4-3fa38323b421_1757x1707.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c1276a12-8169-4e4f-b866-bedd122c099d_1757x1707.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1415,&quot;width&quot;:1456,&quot;resizeWidth&quot;:484,&quot;bytes&quot;:364981,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/161199380?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1276a12-8169-4e4f-b866-bedd122c099d_1757x1707.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!ARiD!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8778fa6b-9c22-4613-83f4-3fa38323b421_1757x1707.png 424w, https://substackcdn.com/image/fetch/$s_!ARiD!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8778fa6b-9c22-4613-83f4-3fa38323b421_1757x1707.png 848w, 
https://substackcdn.com/image/fetch/$s_!ARiD!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8778fa6b-9c22-4613-83f4-3fa38323b421_1757x1707.png 1272w, https://substackcdn.com/image/fetch/$s_!ARiD!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8778fa6b-9c22-4613-83f4-3fa38323b421_1757x1707.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Agent Networks connected by A2A</figcaption></figure></div><p>Having said this, will MCP remain as the protocol to 
stitch new kinds of applications with legacy systems, and will it fade into irrelevance once Agents take over? Who knows, let&#8217;s wait and see.</p><p>But if that happens, guess which player in the industry I am betting on :)</p><p></p><h3>Wrapping up.</h3><p>We are living in exciting times. The way the new type of Agentic applications will be connected at scale is being defined before our eyes. </p><p>A2A may be the new kid on the block, but it&#8217;s quickly shaping up to be the leader in inter-Agent communication. While MCP brought structure to how LLMs integrate context, A2A is solving for what MCP lacks: security, state management, and real-time collaboration. Will A2A eat MCP? Who knows.</p><p>While the official stance is that the two protocols solve completely different problems, there are potential overlaps, and one could expect both protocols to expand in scope.</p><p>If the future is Agentic, and companies begin exposing Agents rather than just tools or data, the protocol that enables seamless inter-Agent interaction might just be the winner. 
Right now, it seems like A2A might be making the right moves.</p><p>That is it for today, hope to see you in the next episode!</p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/p/sponsorships&quot;,&quot;text&quot;:&quot;Partner with SwirlAI&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.newsletter.swirlai.com/p/sponsorships"><span>Partner with SwirlAI</span></a></p>]]></content:encoded></item><item><title><![CDATA[The evolution of Modern RAG 
Architectures.]]></title><description><![CDATA[Learn how to choose the best RAG architecture for your business case.]]></description><link>https://www.newsletter.swirlai.com/p/the-evolution-of-modern-rag-architectures</link><guid isPermaLink="false">https://www.newsletter.swirlai.com/p/the-evolution-of-modern-rag-architectures</guid><dc:creator><![CDATA[Aurimas Griciūnas]]></dc:creator><pubDate>Mon, 07 Apr 2025 07:43:33 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/7430fbad-21da-4918-88cc-3d593254f310_2789x2392.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#128075; <em>I am <a href="https://swrlai.com/4j6NETi">Aurimas</a>. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">SwirlAI Newsletter is a reader-supported publication. To receive new posts and support my work, consider becoming a subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><p>RAG (Retrieval Augmented Generation) based systems have been, and continue to be, one of the most useful applications utilising LLMs in enterprises. 
I remember writing the first post about RAG almost 2 years ago when the term itself was not yet widely adopted.</p><p></p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;6fbc6d39-f4d1-4e9c-bdf5-cdb3dc75c6f0&quot;,&quot;caption&quot;:&quot;&#128075; I am Aurimas. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in Data Engineering, MLOps, Machine Learning and overall Data space.&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;SAI Notes #08: LLM based Chatbots to query your Private Knowledge Base.&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;I have over a decade of work experience in various data related fields: Data Analytics, Data Science, Machine Learning, Data Engineering, Cloud Engineering. 
For three years I have led teams working with Data and Infrastructure.&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2023-06-25T06:40:09.902Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcad6071b-8d2f-4253-8d4e-27b5f7536917_1903x2270.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/sai-notes-08-llm-based-chatbots-to&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:130699827,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:63,&quot;comment_count&quot;:9,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;SwirlAI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27ed734e-48b5-446d-a93d-5a54178a0e34_1024x1024.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p></p><p>What I described back then was a RAG system implemented in the most naive way possible. The industry has come a long way since then, introducing different kinds of advanced techniques to the process. </p><p>In this Newsletter episode we will go through the evolution of RAG, from Naive to Agentic. After reading it, you will understand which challenges were tackled at each step of the evolution.</p><p></p><h3>The emergence of Naive RAG.</h3><p>Naive RAG emerged at almost the same time as LLMs became mainstream with the introduction of ChatGPT at the end of 2022. 
The Retrieval Augmented Generation technique was brought to life to address issues that LLMs face on their own. In short:</p><ul><li><p>Hallucinations.</p></li><li><p>Limited Context window size.</p></li><li><p>Lack of access to non-public data.</p></li><li><p>Parametric knowledge limited to the latest data the model was trained on.</p><p> </p></li></ul><p>The simplest implementation of RAG can be defined in the following steps:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!J6sw!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa53917f2-6328-4645-b6c9-b634e4c00d75_1903x1930.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!J6sw!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa53917f2-6328-4645-b6c9-b634e4c00d75_1903x1930.png 424w, https://substackcdn.com/image/fetch/$s_!J6sw!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa53917f2-6328-4645-b6c9-b634e4c00d75_1903x1930.png 848w, https://substackcdn.com/image/fetch/$s_!J6sw!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa53917f2-6328-4645-b6c9-b634e4c00d75_1903x1930.png 1272w, https://substackcdn.com/image/fetch/$s_!J6sw!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa53917f2-6328-4645-b6c9-b634e4c00d75_1903x1930.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!J6sw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa53917f2-6328-4645-b6c9-b634e4c00d75_1903x1930.png" width="592" height="600.5384615384615" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a53917f2-6328-4645-b6c9-b634e4c00d75_1903x1930.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c4a3120e-abb2-4ae6-b3d9-42cb2ed160ce_1903x1930.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1477,&quot;width&quot;:1456,&quot;resizeWidth&quot;:592,&quot;bytes&quot;:396943,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/159546301?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc4a3120e-abb2-4ae6-b3d9-42cb2ed160ce_1903x1930.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!J6sw!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa53917f2-6328-4645-b6c9-b634e4c00d75_1903x1930.png 424w, https://substackcdn.com/image/fetch/$s_!J6sw!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa53917f2-6328-4645-b6c9-b634e4c00d75_1903x1930.png 848w, https://substackcdn.com/image/fetch/$s_!J6sw!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa53917f2-6328-4645-b6c9-b634e4c00d75_1903x1930.png 1272w, 
https://substackcdn.com/image/fetch/$s_!J6sw!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa53917f2-6328-4645-b6c9-b634e4c00d75_1903x1930.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Naive RAG.</figcaption></figure></div><p><em>Preprocessing:</em></p><ol><li><p>Split the text corpus of the entire knowledge base into chunks - each chunk will represent a single piece of context available to be queried. Data of interest can come from multiple sources, e.g. 
Documentation in Confluence supplemented by PDF reports.</p></li><li><p>Use the Embedding Model to transform each of the chunks into a vector embedding.</p></li><li><p>Store all vector embeddings in a Vector Database. Save the text that each embedding represents separately, together with a pointer to the embedding.</p></li></ol><p><em>Retrieval:</em></p><ol start="4"><li><p>Embed the question/query you want to ask using the same Embedding Model that was used to embed the knowledge base itself.</p></li><li><p>Use the resulting Vector Embedding to run a query against the index in the Vector Database. Choose how many vectors you want to retrieve from the Vector Database - this determines the amount of context you will retrieve and eventually use to answer the query.</p></li><li><p>The Vector DB performs an Approximate Nearest Neighbour (ANN) search for the provided vector embedding against the index and returns the previously chosen number of context vectors. The procedure returns the vectors that are most similar in a given Embedding/Latent space. Map the returned Vector Embeddings to the text chunks that they represent.</p></li><li><p>Pass the question together with the retrieved context text chunks to the LLM via a prompt. Instruct the LLM to only use the provided context to answer the given question. This does not mean that no Prompt Engineering will be needed - you will want to ensure that the answers returned by the LLM fall into expected boundaries, e.g. 
if there is no data in the retrieved context that could be used, make sure that no made-up answer is provided.</p></li></ol><p></p><h3>Moving pieces in the Naive RAG system.</h3><p>Even without any advanced techniques, there are many moving pieces to consider when building a production-grade RAG system.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Jdmr!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcaa882ac-d4c9-43b5-bf9a-1d0ce50e5d5c_2126x1920.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Jdmr!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcaa882ac-d4c9-43b5-bf9a-1d0ce50e5d5c_2126x1920.png 424w, https://substackcdn.com/image/fetch/$s_!Jdmr!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcaa882ac-d4c9-43b5-bf9a-1d0ce50e5d5c_2126x1920.png 848w, https://substackcdn.com/image/fetch/$s_!Jdmr!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcaa882ac-d4c9-43b5-bf9a-1d0ce50e5d5c_2126x1920.png 1272w, https://substackcdn.com/image/fetch/$s_!Jdmr!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcaa882ac-d4c9-43b5-bf9a-1d0ce50e5d5c_2126x1920.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Jdmr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcaa882ac-d4c9-43b5-bf9a-1d0ce50e5d5c_2126x1920.png" width="598" height="540.0892857142857" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/caa882ac-d4c9-43b5-bf9a-1d0ce50e5d5c_2126x1920.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/50e5dafb-5061-4d94-bba1-c89627e1aaed_2126x1920.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1315,&quot;width&quot;:1456,&quot;resizeWidth&quot;:598,&quot;bytes&quot;:506221,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/159546301?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50e5dafb-5061-4d94-bba1-c89627e1aaed_2126x1920.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Jdmr!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcaa882ac-d4c9-43b5-bf9a-1d0ce50e5d5c_2126x1920.png 424w, https://substackcdn.com/image/fetch/$s_!Jdmr!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcaa882ac-d4c9-43b5-bf9a-1d0ce50e5d5c_2126x1920.png 848w, https://substackcdn.com/image/fetch/$s_!Jdmr!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcaa882ac-d4c9-43b5-bf9a-1d0ce50e5d5c_2126x1920.png 1272w, https://substackcdn.com/image/fetch/$s_!Jdmr!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcaa882ac-d4c9-43b5-bf9a-1d0ce50e5d5c_2126x1920.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">RAG - moving pieces.</figcaption></figure></div><p><strong>Retrieval:</strong></p><p><em>F) Chunking</em> - how you chunk the data that you will use as external context.</p><ul><li><p>Small or large chunks.</p></li><li><p>Sliding or tumbling window for chunking.</p></li><li><p>Retrieve parent or linked chunks when searching, or just use the originally retrieved data.</p></li></ul><p><em>C)</em> Choosing the embedding model used to embed the external context into the latent space and to query it. Consider contextual embeddings.</p><p><em>D) </em>Vector Database.</p><ul><li><p>Which Database to choose.</p></li><li><p>Where to host.</p></li><li><p>What metadata to store together with embeddings. 
This is the data used for pre-filtering and post-filtering.</p></li><li><p>Indexing strategy.</p></li></ul><p><em>E) </em>Vector Search.</p><ul><li><p>Choice of similarity measure.</p></li><li><p>Choosing the query path - metadata first vs. ANN first.</p></li><li><p>Hybrid search.</p></li></ul><p><em>G) Heuristics</em> - business rules applied to your retrieval procedure.</p><ul><li><p>Time importance.</p></li><li><p>Duplicate context (diversity ranking).</p></li><li><p>Source retrieval.</p></li><li><p>Conditional document preprocessing.</p></li></ul><p><strong>Generation:</strong></p><p><em>A) LLM</em> - Choosing the right Large Language Model to power your application.</p><p><em>B) Prompt Engineering</em> - having context available for usage in your prompts does not free you from the hard work of engineering the prompts. You will still need to align the system to produce outputs that you desire and prevent jailbreak scenarios.</p><p></p><p>After all of this work we have ourselves a working RAG system.</p><p><strong>The harsh truth</strong> - this is rarely good enough to solve real business problems. The accuracy of this kind of system might be low for various reasons. </p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p></p><h3>Advanced techniques to improve Naive RAG.</h3><p>Some of the more successfully adopted techniques for improving the accuracy of Naive RAG systems are the following:</p><ul><li><p>Query Alteration - there are a few techniques that can be employed:</p><ul><li><p>Query rewriting - ask the LLM to rewrite the original query to better fit the retrieval process. It can be rewritten in multiple ways. E.g. 
fixing grammar or simplifying the query into short, succinct statements.</p></li><li><p>Query Expansion - ask the LLM to rewrite the query multiple times to create multiple variations of it. Then, run the retrieval process multiple times to retrieve more, potentially relevant, context.</p></li></ul></li><li><p>Reranking - rerank the originally retrieved documents using a heavier process compared to regular contextual search. Usually, this involves using a larger model and retrieving considerably more documents than needed during the retrieval phase. Reranking also works well with the Query Expansion pattern from the previous point, as it returns more data than usual. The overall process is similar to what we are used to seeing in Recommendation Engines. I wrote about such architectures a while ago here (second paragraph):</p><p></p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;70a5490a-d438-4e62-9bbf-e728054e8bcc&quot;,&quot;caption&quot;:&quot;&#128075; This is Aurimas. I write the weekly SAI Newsletter where my goal is to present complicated Data related concepts in a simple and easy to digest way. The goal is to help You UpSkill in Data Engineering, MLOps, Machine Learning and Data Science areas.&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;SAI #10: Airflow - Architecture, Model Deployment - AutoScaling and more...&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;I have over a decade of work experience in various data related fields: Data Analytics, Data Science, Machine Learning, Data Engineering, Cloud Engineering. 
For three years I have led teams working with Data and Infrastructure.&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2022-12-17T07:13:14.042Z&quot;,&quot;cover_image&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fbucketeer-e05bbc84-baa3-437e-9518-adb32be77984.s3.amazonaws.com%2Fpublic%2Fimages%2F0117d10e-28a9-4a7d-af01-4201afd1b66a_4051x4899.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/sai-10-airflow-architecture-model&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:91034339,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:20,&quot;comment_count&quot;:0,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;SwirlAI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27ed734e-48b5-446d-a93d-5a54178a0e34_1024x1024.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p></p></li><li><p>Fine-Tuning of the embedding model - retrieval of data in some domains (e.g. medical data) works poorly with base embedding models. This is where you might need to fine-tune your own embedding model.</p></li></ul><p>Let&#8217;s move on to some other advanced RAG techniques and architectures.</p><p></p><h4>Contextual Retrieval.</h4><p>The idea of Contextual Retrieval was suggested by the Anthropic team late last year. 
It aims to improve accuracy and relevance of data that is retrieved in Retrieval Augmented Generation based AI systems.</p><p>I love the intuitiveness and simplicity of Contextual Retrieval. And it does provide good results.</p><p>Here are the steps of implementation:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!o30y!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0e1d5c5-5978-4bd1-81bd-9fc7b9920267_2098x1321.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!o30y!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0e1d5c5-5978-4bd1-81bd-9fc7b9920267_2098x1321.png 424w, https://substackcdn.com/image/fetch/$s_!o30y!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0e1d5c5-5978-4bd1-81bd-9fc7b9920267_2098x1321.png 848w, https://substackcdn.com/image/fetch/$s_!o30y!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0e1d5c5-5978-4bd1-81bd-9fc7b9920267_2098x1321.png 1272w, https://substackcdn.com/image/fetch/$s_!o30y!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0e1d5c5-5978-4bd1-81bd-9fc7b9920267_2098x1321.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!o30y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0e1d5c5-5978-4bd1-81bd-9fc7b9920267_2098x1321.png" width="1456" height="917" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c0e1d5c5-5978-4bd1-81bd-9fc7b9920267_2098x1321.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f24cfff9-8064-4b59-ba96-5f659e1c1aed_2098x1321.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:917,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:251950,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/159546301?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff24cfff9-8064-4b59-ba96-5f659e1c1aed_2098x1321.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!o30y!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0e1d5c5-5978-4bd1-81bd-9fc7b9920267_2098x1321.png 424w, https://substackcdn.com/image/fetch/$s_!o30y!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0e1d5c5-5978-4bd1-81bd-9fc7b9920267_2098x1321.png 848w, https://substackcdn.com/image/fetch/$s_!o30y!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0e1d5c5-5978-4bd1-81bd-9fc7b9920267_2098x1321.png 1272w, https://substackcdn.com/image/fetch/$s_!o30y!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc0e1d5c5-5978-4bd1-81bd-9fc7b9920267_2098x1321.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Contextual Retrieval.</figcaption></figure></div><p><em>Preprocessing:</em></p><ol><li><p>Split each of your documents into chunks using your chosen chunking strategy.</p></li><li><p>For each chunk separately, add it to a prompt together with the whole document.</p></li><li><p>Include instructions to situate the chunk within the document and generate a short context for it. Pass the prompt to your chosen LLM.</p></li><li><p>Combine the context generated in the previous step with the chunk it was generated for.</p></li><li><p>Pass the combined text through a TF-IDF embedder.</p></li><li><p>Pass the combined text through an LLM-based embedding model.</p></li><li><p>Store the data generated in steps 5. and 6. 
in databases that support efficient search.</p></li></ol><p><em>Retrieval:</em></p><ol start="8"><li><p>Use the user query to retrieve relevant context: ANN search over the embeddings for semantic matches and the TF-IDF index for exact matches.</p></li><li><p>Use Rank Fusion techniques to combine and deduplicate the retrieved results and produce the top N elements.</p></li><li><p>Rerank the previous results and narrow them down to the top K elements.</p></li><li><p>Pass the result of step 10. to an LLM together with the user query to produce the final answer.</p></li></ol><p>Some thoughts:</p><ul><li><p><em>Step 3.</em> might sound extremely costly, and it is, but with Prompt Caching the costs can be significantly reduced, because the full document is repeated across the prompts for all of its chunks.</p></li><li><p>Prompt caching can be implemented with both proprietary and open source models (refer to the next section).</p></li></ul><p></p><h3>The brief emergence of Cache Augmented Generation.</h3><p>At the end of 2024, a white paper briefly shook social media. It introduced a technique that would change RAG forever (or would it?) - Cache Augmented Generation. 
We already know how regular RAG works, here is a brief description of CAG:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!l7bS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F406e1c32-9e94-4271-83e5-b17d9fd92174_2467x2549.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!l7bS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F406e1c32-9e94-4271-83e5-b17d9fd92174_2467x2549.png 424w, https://substackcdn.com/image/fetch/$s_!l7bS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F406e1c32-9e94-4271-83e5-b17d9fd92174_2467x2549.png 848w, https://substackcdn.com/image/fetch/$s_!l7bS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F406e1c32-9e94-4271-83e5-b17d9fd92174_2467x2549.png 1272w, https://substackcdn.com/image/fetch/$s_!l7bS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F406e1c32-9e94-4271-83e5-b17d9fd92174_2467x2549.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!l7bS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F406e1c32-9e94-4271-83e5-b17d9fd92174_2467x2549.png" width="580" height="599.1208791208791" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/406e1c32-9e94-4271-83e5-b17d9fd92174_2467x2549.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ba496458-c2ad-4acc-a755-b29eeaaf4b12_2467x2549.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1504,&quot;width&quot;:1456,&quot;resizeWidth&quot;:580,&quot;bytes&quot;:448883,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/159546301?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fba496458-c2ad-4acc-a755-b29eeaaf4b12_2467x2549.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!l7bS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F406e1c32-9e94-4271-83e5-b17d9fd92174_2467x2549.png 424w, https://substackcdn.com/image/fetch/$s_!l7bS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F406e1c32-9e94-4271-83e5-b17d9fd92174_2467x2549.png 848w, https://substackcdn.com/image/fetch/$s_!l7bS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F406e1c32-9e94-4271-83e5-b17d9fd92174_2467x2549.png 1272w, https://substackcdn.com/image/fetch/$s_!l7bS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F406e1c32-9e94-4271-83e5-b17d9fd92174_2467x2549.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">RAG vs. CAG</figcaption></figure></div><ol><li><p>Pre-compute all of the external context into the KV Cache of the LLM and keep it in memory. This only needs to be done once; the following steps can be run multiple times without recomputing the initial cache.</p></li><li><p>Pass the LLM a prompt that includes the user query and a system prompt with instructions on how the cached context should be used.</p></li><li><p>Return the generated answer to the user. After this, clear any generations from the cache and keep only the initially cached context. 
This makes the LLM ready for the next generation request.</p></li></ol><p>CAG promised more accurate retrieval by storing all of the context in the KV cache instead of retrieving just a portion of the data each time an answer needs to be computed. The reality?</p><ul><li><p>CAG does not solve the inaccuracies that come with extremely long contexts.</p></li><li><p>CAG has many limitations when it comes to data security.</p></li><li><p>Loading the entire internal knowledge base is close to impossible in large organisations.</p></li><li><p>The cache becomes static, and adding fresh data is problematic.</p></li></ul><p>Actually, we had already been using a variant of CAG for some time, ever since most LLM providers introduced prompt caching. 
What we did can be described as a fusion of CAG and RAG and implemented in the following procedure:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fzGb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfcd46a5-44f5-42e6-9ecf-cbc26ba5384b_2276x1969.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fzGb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfcd46a5-44f5-42e6-9ecf-cbc26ba5384b_2276x1969.png 424w, https://substackcdn.com/image/fetch/$s_!fzGb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfcd46a5-44f5-42e6-9ecf-cbc26ba5384b_2276x1969.png 848w, https://substackcdn.com/image/fetch/$s_!fzGb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfcd46a5-44f5-42e6-9ecf-cbc26ba5384b_2276x1969.png 1272w, https://substackcdn.com/image/fetch/$s_!fzGb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfcd46a5-44f5-42e6-9ecf-cbc26ba5384b_2276x1969.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fzGb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfcd46a5-44f5-42e6-9ecf-cbc26ba5384b_2276x1969.png" width="584" height="505.38461538461536" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bfcd46a5-44f5-42e6-9ecf-cbc26ba5384b_2276x1969.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/122d444c-9613-4904-8f38-bf2aa7226fa0_2276x1969.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1260,&quot;width&quot;:1456,&quot;resizeWidth&quot;:584,&quot;bytes&quot;:342576,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/159546301?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F122d444c-9613-4904-8f38-bf2aa7226fa0_2276x1969.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fzGb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfcd46a5-44f5-42e6-9ecf-cbc26ba5384b_2276x1969.png 424w, https://substackcdn.com/image/fetch/$s_!fzGb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfcd46a5-44f5-42e6-9ecf-cbc26ba5384b_2276x1969.png 848w, https://substackcdn.com/image/fetch/$s_!fzGb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfcd46a5-44f5-42e6-9ecf-cbc26ba5384b_2276x1969.png 1272w, https://substackcdn.com/image/fetch/$s_!fzGb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbfcd46a5-44f5-42e6-9ecf-cbc26ba5384b_2276x1969.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Fusion of RAG and CAG</figcaption></figure></div><p><em>Data Preprocessing:</em></p><ol><li><p>We use only rarely changing data sources for Cache Augmented Generation. Beyond the requirement that the data changes rarely, we should also consider which of the sources are often hit by relevant queries. Only once we have this information do we pre-compute the selected data into the KV Cache of the LLM and keep it in memory. This only needs to be done once; the following steps can be run multiple times without recomputing the initial cache.</p></li><li><p>For RAG, if necessary, precompute and store vector embeddings in a compatible database to be searched later in step 4. 
Sometimes simpler data types are enough for RAG, and a regular database might suffice.</p></li></ol><p><em>Query Path:</em></p><ol start="3"><li><p>Compose a prompt that includes the user query and a system prompt with instructions on how the cached context and the retrieved external context should be used by the LLM.</p></li><li><p>Embed the user query for semantic search via vector DBs and query the context store to retrieve relevant data. If semantic search is not required, query other sources, such as real-time databases or the web.</p></li><li><p>Enrich the final prompt with the external context retrieved in <em>step 4</em>.</p></li><li><p>Return the final answer to the user.</p></li></ol><p>Let&#8217;s move to the latest evolution - Agentic RAG.</p><p></p><h3>Agentic RAG.</h3><p>Agentic RAG adds two components that aim to reduce inconsistencies when answering complex user queries.</p><ul><li><p>Data Source Routing.</p></li><li><p>Reflection.</p></li></ul><p>Let&#8217;s explore how it works end-to-end.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!YuBY!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2741795-0b06-4b88-bbcd-9143fd894a71_2019x1609.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!YuBY!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2741795-0b06-4b88-bbcd-9143fd894a71_2019x1609.png 424w, https://substackcdn.com/image/fetch/$s_!YuBY!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2741795-0b06-4b88-bbcd-9143fd894a71_2019x1609.png 848w, 
https://substackcdn.com/image/fetch/$s_!YuBY!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2741795-0b06-4b88-bbcd-9143fd894a71_2019x1609.png 1272w, https://substackcdn.com/image/fetch/$s_!YuBY!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2741795-0b06-4b88-bbcd-9143fd894a71_2019x1609.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!YuBY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2741795-0b06-4b88-bbcd-9143fd894a71_2019x1609.png" width="602" height="479.61538461538464" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f2741795-0b06-4b88-bbcd-9143fd894a71_2019x1609.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/768cab89-d1e2-4600-99e7-e7fa9a7c7612_2019x1609.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1160,&quot;width&quot;:1456,&quot;resizeWidth&quot;:602,&quot;bytes&quot;:278511,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/159546301?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F768cab89-d1e2-4600-99e7-e7fa9a7c7612_2019x1609.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!YuBY!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2741795-0b06-4b88-bbcd-9143fd894a71_2019x1609.png 424w, 
https://substackcdn.com/image/fetch/$s_!YuBY!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2741795-0b06-4b88-bbcd-9143fd894a71_2019x1609.png 848w, https://substackcdn.com/image/fetch/$s_!YuBY!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2741795-0b06-4b88-bbcd-9143fd894a71_2019x1609.png 1272w, https://substackcdn.com/image/fetch/$s_!YuBY!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff2741795-0b06-4b88-bbcd-9143fd894a71_2019x1609.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Agentic RAG.</figcaption></figure></div><ol><li><p>Analysis of the user query: we pass the original user query to an LLM-based Agent for analysis. This is where:</p><ol><li><p>The original query can be rewritten, sometimes multiple times, to create either a single query or multiple queries to be passed down the pipeline.</p></li><li><p>The Agent decides if additional data sources are required to answer the query. This is where the first step of agency kicks in.</p></li></ol></li><li><p>If additional data is required, the Retrieval step is triggered. This is where Data Source Routing happens: you could have one or multiple data sets available to the Agentic RAG system, and the Agent is given agency to choose which ones should be tapped into to answer this specific query. A few examples:</p><ol><li><p>Real-time user data. This is a pretty cool concept, as we might have some real-time information available for the user, such as their current location.</p></li><li><p>Internal documents that a user might be interested in.</p></li><li><p>Data available on the web.</p></li><li><p>&#8230;</p></li></ol></li><li><p>Once the data is retrieved from potentially multiple data sources, we rerank it just as in regular RAG. This is also an important step, as multiple data sources utilising different storage technologies can be integrated into the RAG system here. Complexities of the retrieval process can be hidden behind tools given to the Agent.</p></li><li><p>We try to compose the answer (or multiple answers, or a set of actions) directly via the LLM. 
This can happen in the first round or after Reflection.</p></li><li><p>The answer gets analysed, summarised and evaluated for correctness and relevance:</p><ol><li><p>If the Agent decides that the answer is good enough, it gets returned to the user.</p></li><li><p>If the Agent decides that the answer needs improvement, we try to rewrite the user query and repeat the generation loop. This Reflection loop is the second difference between regular and Agentic RAG.</p></li></ol></li></ol><p></p><p>The recently open-sourced MCP project by Anthropic can supercharge the development of Agentic RAG applications. Learn how in one of my previous Newsletter episodes:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;28530aa5-6422-4f25-ba1a-9d6c5e19ae5f&quot;,&quot;caption&quot;:&quot;&#128075; I am Aurimas. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Everything you need to know about MCP.&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;I have over a decade of work experience in various data related fields: Data Analytics, Data Science, Machine Learning, Data Engineering, Cloud Engineering. 
For three years I have led teams working with Data and Infrastructure.&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-03-15T15:16:01.285Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9c348e1b-a175-4c65-8ea6-d773f957488e_1934x1554.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/everything-you-need-to-know-about&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:159065609,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:81,&quot;comment_count&quot;:6,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;SwirlAI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27ed734e-48b5-446d-a93d-5a54178a0e34_1024x1024.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p></p><h3>Wrapping up.</h3><p>We have reviewed the evolution of Modern RAG architectures. RAG is not dead and is not going anywhere anytime soon. I believe the architectures will continue evolving for some time to come and it is a good investment to learn them, and understand when to use what.</p><p>In general, the simpler the better as you will face new challenges while making the system more complex. Some of the emerging challenges include:</p><ul><li><p>Difficulties in evaluating the end-to-end system.</p></li><li><p>Increase in end-to-end latency due to multiple LLM calls.</p></li><li><p>Increase in the price of operation.</p></li><li><p>&#8230;</p></li></ul><p></p><p>This is it! 
Hope to see you in the next issue.</p><p></p>]]></content:encoded></item><item><title><![CDATA[Everything you need to know about MCP.]]></title><description><![CDATA[And why it is important for how we shape and build Agentic 
Systems.]]></description><link>https://www.newsletter.swirlai.com/p/everything-you-need-to-know-about</link><guid isPermaLink="false">https://www.newsletter.swirlai.com/p/everything-you-need-to-know-about</guid><dc:creator><![CDATA[Aurimas Griciūnas]]></dc:creator><pubDate>Sat, 15 Mar 2025 15:16:01 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/9c348e1b-a175-4c65-8ea6-d773f957488e_1934x1554.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#128075; <em>I am <a href="https://www.linkedin.com/in/aurimas-griciunas">Aurimas</a>. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.</em></p><div><hr></div><p>MCP (Model Context Protocol) by Anthropic is all over the news this month. I have been following the project since its public release, announced on 25th of November, 2024. 
In this article I will share my thoughts on why I believe you should start looking into MCP and why it is important for the future of AI Agents and Agentic systems. Here is the outline of what you will find in the article:</p><ul><li><p>Refresher on AI Agents and Agentic Systems.</p></li><li><p>What is MCP?</p></li><li><p>Splitting control responsibilities through MCP.</p></li><li><p>Evolving AI Agent architecture with MCP.</p></li><li><p>The future roadmap of MCP.</p><p></p></li></ul><div><hr></div><h4><strong>The Only Cloud-Native Kafka Implementation with Jepsen Validation.</strong></h4><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!psZy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafa16af2-4d77-44c5-ba19-72225b35d338_813x347.svg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!psZy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafa16af2-4d77-44c5-ba19-72225b35d338_813x347.svg 424w, https://substackcdn.com/image/fetch/$s_!psZy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafa16af2-4d77-44c5-ba19-72225b35d338_813x347.svg 848w, https://substackcdn.com/image/fetch/$s_!psZy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafa16af2-4d77-44c5-ba19-72225b35d338_813x347.svg 1272w, https://substackcdn.com/image/fetch/$s_!psZy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafa16af2-4d77-44c5-ba19-72225b35d338_813x347.svg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!psZy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafa16af2-4d77-44c5-ba19-72225b35d338_813x347.svg" width="328" height="139.99507995079952" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/afa16af2-4d77-44c5-ba19-72225b35d338_813x347.svg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:347,&quot;width&quot;:813,&quot;resizeWidth&quot;:328,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Buf&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Buf" title="Buf" srcset="https://substackcdn.com/image/fetch/$s_!psZy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafa16af2-4d77-44c5-ba19-72225b35d338_813x347.svg 424w, https://substackcdn.com/image/fetch/$s_!psZy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafa16af2-4d77-44c5-ba19-72225b35d338_813x347.svg 848w, https://substackcdn.com/image/fetch/$s_!psZy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafa16af2-4d77-44c5-ba19-72225b35d338_813x347.svg 1272w, https://substackcdn.com/image/fetch/$s_!psZy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fafa16af2-4d77-44c5-ba19-72225b35d338_813x347.svg 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>Bufstream offers a robust streaming platform built on Protobuf, designed for 
high-throughput, low-latency data pipelines. Its architecture prioritizes consistency and fault tolerance, as demonstrated by rigorous Jepsen testing - plus it's up to 8x cheaper than self-managed Kafka. Explore the <a href="https://fnf.dev/3Fp0qx6">Jepsen Report</a> for detailed analysis of its performance under various failure scenarios. Implement reliable, schema-driven streaming with <a href="https://fnf.dev/3Fp0qx6">Bufstream</a>.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://fnf.dev/3Fp0qx6&quot;,&quot;text&quot;:&quot;Check it out.&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://fnf.dev/3Fp0qx6"><span>Check it out.</span></a></p><div><hr></div><p></p><h3>Refresher on AI Agents and Agentic Systems.</h3><p>In its simplest high-level definition, an AI agent is an application that uses an LLM at its core as the reasoning engine to decide on the steps it needs to take to fulfil the user&#8217;s intent. 
It is usually explained via an image similar to the picture below and is composed of multiple building blocks:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fVcp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fVcp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png 424w, https://substackcdn.com/image/fetch/$s_!fVcp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png 848w, https://substackcdn.com/image/fetch/$s_!fVcp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png 1272w, https://substackcdn.com/image/fetch/$s_!fVcp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fVcp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png" width="1456" height="1094" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c44a6b00-0d44-4efd-8fab-28e035b662d2_2926x2198.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1094,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:294780,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!fVcp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png 424w, https://substackcdn.com/image/fetch/$s_!fVcp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png 848w, https://substackcdn.com/image/fetch/$s_!fVcp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png 1272w, https://substackcdn.com/image/fetch/$s_!fVcp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">AI Agent</figcaption></figure></div><ul><li><p><em>Planning</em> - the capability to plan a sequence of actions that the application needs to perform in order to solve for the provided intent. 
There are many strategies for this. I have written an article about one of them, Reflection, where we build it from scratch without using any LLM orchestration frameworks: </p><p></p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;f2e91b4e-0884-4c59-88f9-a612840054e5&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Building AI Agents from scratch - Part 2: Reflection and Working Memory&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;I have over a decade of work experience in various data related fields: Data Analytics, Data Science, Machine Learning, Data Engineering, Cloud Engineering. For three years I have led teams working with Data and Infrastructure.&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2025-01-04T08:23:13.733Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/50268cdf-10a7-48f2-bc57-e7e31e5516f1_2882x2349.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/building-ai-agents-from-scratch-part-8ca&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:153986635,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:67,&quot;comment_count&quot;:5,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;SwirlAI 
Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27ed734e-48b5-446d-a93d-5a54178a0e34_1024x1024.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p></p></li><li><p><em>Memory</em> - short-term and long-term memory containing any information that the agent might need to reason about the actions it needs to take. This information is usually passed to the LLM via a system prompt as part of the core. You can read more about the different types of memory in one of my previous articles:</p><p></p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;2ca6f7c8-b263-4458-a4cd-c5ade8d57584&quot;,&quot;caption&quot;:&quot;&#128075; I am Aurimas. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Memory in Agent Systems&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;I have over a decade of work experience in various data related fields: Data Analytics, Data Science, Machine Learning, Data Engineering, Cloud Engineering. 
For three years I have led teams working with Data and Infrastructure.&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2024-10-30T10:03:28.773Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c7650705-54b4-49a3-91a4-aad0c4093c4b_2926x2198.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/memory-in-agent-systems&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:150888366,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:63,&quot;comment_count&quot;:0,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;SwirlAI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27ed734e-48b5-446d-a93d-5a54178a0e34_1024x1024.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p></p></li><li><p><em>Tools</em> - any function that the application can call to enhance its reasoning capabilities. 
One should not be fooled by the simplicity of this definition, as a tool can be literally anything:</p><ul><li><p>Simple functions defined in code.</p></li><li><p>VectorDBs and other data stores containing context.</p></li><li><p>Regular Machine Learning model APIs.</p></li><li><p>Other Agents!</p></li><li><p>&#8230;</p></li></ul><p>Here is an article where I implement the tool use pattern from scratch: </p><p></p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;293a9336-90e0-44ac-9634-94ad57a422a2&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Building AI Agents from scratch - Part 1: Tool use&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;I have over a decade of work experience in various data related fields: Data Analytics, Data Science, Machine Learning, Data Engineering, Cloud Engineering. 
For three years I have led teams working with Data and Infrastructure.&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2024-12-21T10:30:19.983Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1144abe7-1fb8-4190-b32d-6e59647c858b_2974x2388.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/building-ai-agents-from-scratch-part&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:153433846,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:253,&quot;comment_count&quot;:18,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;SwirlAI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27ed734e-48b5-446d-a93d-5a54178a0e34_1024x1024.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div></li></ul><p></p><p>Not all Agentic Systems are given full agency over execution in the environment. Anthropic has also described &#8220;Augmented LLMs&#8221;, where applications integrating LLMs as reasoning engines are only given control over tools and memory but not planning. 
The topology of the interactions is defined in code rather than planned out by the LLM.</p><p></p><h3>What is MCP?</h3><p>MCP (Model Context Protocol) as defined by Anthropic is:</p><blockquote><p>An open protocol that standardizes how applications provide context to LLMs.</p></blockquote><p>To be more precise, it attempts to standardise how LLM-based applications integrate with other environments.</p><p>In Agentic systems, whether AI Agents or chains of augmented LLMs, context can be provided in multiple ways:</p><ul><li><p>External data - this is part of long-term memory.</p></li><li><p>Tools - the capability of the system to interact with the environment.</p></li><li><p>Dynamic Prompts - that can be injected as part of the system prompt.</p></li><li><p>&#8230;</p></li></ul><p>Below is the high-level architecture of MCP.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!528O!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa77772f-3a47-417a-9780-c6942544f7db_1219x684.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!528O!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa77772f-3a47-417a-9780-c6942544f7db_1219x684.png 424w, https://substackcdn.com/image/fetch/$s_!528O!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa77772f-3a47-417a-9780-c6942544f7db_1219x684.png 848w, https://substackcdn.com/image/fetch/$s_!528O!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa77772f-3a47-417a-9780-c6942544f7db_1219x684.png 1272w, 
https://substackcdn.com/image/fetch/$s_!528O!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa77772f-3a47-417a-9780-c6942544f7db_1219x684.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!528O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa77772f-3a47-417a-9780-c6942544f7db_1219x684.png" width="1219" height="684" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fa77772f-3a47-417a-9780-c6942544f7db_1219x684.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5aa88a15-92ce-47f6-bf2a-265ac8f970ad_1219x684.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:684,&quot;width&quot;:1219,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:102333,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/159065609?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5aa88a15-92ce-47f6-bf2a-265ac8f970ad_1219x684.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!528O!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa77772f-3a47-417a-9780-c6942544f7db_1219x684.png 424w, https://substackcdn.com/image/fetch/$s_!528O!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa77772f-3a47-417a-9780-c6942544f7db_1219x684.png 848w, 
https://substackcdn.com/image/fetch/$s_!528O!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa77772f-3a47-417a-9780-c6942544f7db_1219x684.png 1272w, https://substackcdn.com/image/fetch/$s_!528O!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa77772f-3a47-417a-9780-c6942544f7db_1219x684.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><ol><li><p>MCP Host - Programs using LLMs at the core that want to access data through MCP.</p></li><li><p>MCP Client - Clients that 
maintain 1:1 connections with servers.</p></li><li><p>MCP Server - Lightweight programs that each expose specific capabilities through the standardized Model Context Protocol.</p></li><li><p>Local Data Sources - Your computer&#8217;s files, databases, and services that MCP servers can securely access.</p></li><li><p>Remote Data Sources - External systems available over the internet (e.g., through APIs) that MCP servers can connect to.</p></li></ol><p></p><h4>Why the need to standardise?</h4><p>The current development flow of Agentic applications is chaotic:</p><ul><li><p>There are many Agent frameworks with slight differences. While it is encouraging to see the ecosystem flourish, these slight differences rarely add enough value, yet they can significantly change the way you write code.</p></li><li><p>Integrations with external data sources are usually implemented ad hoc and use different protocols even within a single organisation. The same is true across different companies.</p></li><li><p>Tools are defined in code repositories in slightly different ways. 
How you attach tools to augmented LLMs is different as well.</p></li></ul><p>Ultimately, the goal is to improve how fast we can innovate with Agentic applications, how well we can secure them and how easily we can bring relevant data into the context.</p><p></p><h3>Splitting control responsibilities through MCP.</h3><p>MCP Servers expose three main elements that are purposely built to support a specific segregation of control.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UVok!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UVok!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png 424w, https://substackcdn.com/image/fetch/$s_!UVok!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png 848w, https://substackcdn.com/image/fetch/$s_!UVok!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png 1272w, 
https://substackcdn.com/image/fetch/$s_!UVok!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UVok!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png" width="539" height="385.9207772795217" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9771f5e6-063d-46f2-bcff-e19f2a36bcdc_669x479.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:479,&quot;width&quot;:669,&quot;resizeWidth&quot;:539,&quot;bytes&quot;:42132,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/159065609?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9771f5e6-063d-46f2-bcff-e19f2a36bcdc_669x479.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UVok!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png 424w, https://substackcdn.com/image/fetch/$s_!UVok!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png 848w, 
https://substackcdn.com/image/fetch/$s_!UVok!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png 1272w, https://substackcdn.com/image/fetch/$s_!UVok!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F697ed6d4-dd3a-418a-9301-f4a1d44f255e_669x479.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><ul><li><p>Prompts are designed to be User-Controlled.</p><ul><li><p>The programmer of the server can expose specific prompts (suited for 
interaction with data exposed by the server) that can be injected into the application using LLMs and exposed to the user of the given application.</p></li></ul></li><li><p>Resources are designed to be Application-Controlled.</p><ul><li><p>Resources are any kind of data (text or binary) that can be used by the application built to utilise LLMs. The programmer of the application (usually an AI Engineer) is responsible for codifying how this information should be used by the application. Usually, there is no automation involved and the LLM does not participate in this choice.</p></li></ul></li><li><p>Tools are designed to be Model-Controlled.</p><ul><li><p>If we want to give our application agency over how it interacts with the environment, we use tools to do that. The MCP Server exposes an endpoint that lists all of the available tools with their descriptions and required arguments. The application can pass this list to the LLM so that it can decide which tools are needed for the task at hand and how they should be invoked.</p></li></ul></li></ul><h3>Evolving AI Agent architecture with MCP.</h3><p>To explain how the architecture of an Agentic application could evolve with MCP, let&#8217;s take an example of a very simple Agentic RAG.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!STC4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bbdd713-83fb-434c-85a5-49ebbb2ff052_2019x1609.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!STC4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bbdd713-83fb-434c-85a5-49ebbb2ff052_2019x1609.png 424w, 
https://substackcdn.com/image/fetch/$s_!STC4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bbdd713-83fb-434c-85a5-49ebbb2ff052_2019x1609.png 848w, https://substackcdn.com/image/fetch/$s_!STC4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bbdd713-83fb-434c-85a5-49ebbb2ff052_2019x1609.png 1272w, https://substackcdn.com/image/fetch/$s_!STC4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bbdd713-83fb-434c-85a5-49ebbb2ff052_2019x1609.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!STC4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bbdd713-83fb-434c-85a5-49ebbb2ff052_2019x1609.png" width="646" height="514.6703296703297" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8bbdd713-83fb-434c-85a5-49ebbb2ff052_2019x1609.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/131e25fc-b608-457a-ba5c-2145702eff2e_2019x1609.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1160,&quot;width&quot;:1456,&quot;resizeWidth&quot;:646,&quot;bytes&quot;:283190,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/159065609?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F131e25fc-b608-457a-ba5c-2145702eff2e_2019x1609.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!STC4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bbdd713-83fb-434c-85a5-49ebbb2ff052_2019x1609.png 424w, https://substackcdn.com/image/fetch/$s_!STC4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bbdd713-83fb-434c-85a5-49ebbb2ff052_2019x1609.png 848w, https://substackcdn.com/image/fetch/$s_!STC4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bbdd713-83fb-434c-85a5-49ebbb2ff052_2019x1609.png 1272w, https://substackcdn.com/image/fetch/$s_!STC4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bbdd713-83fb-434c-85a5-49ebbb2ff052_2019x1609.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Agentic RAG</figcaption></figure></div><p>Here are the steps such a system could involve:</p><ol><li><p>Analysis of the user query: we pass the original user query to an LLM-based Agent for analysis. This is where:</p><ol><li><p>The original query can be rewritten, sometimes multiple times, to create either a single query or multiple queries to be passed down the pipeline.</p></li><li><p>The agent decides if additional data sources are required to answer the query.</p></li></ol></li><li><p>If additional data is required, the Retrieval step is triggered. In the Agentic RAG case, we could have one or more agents responsible for figuring out which data sources should be tapped into. A few examples:</p><ol><li><p>Real time user data. 
This is a pretty cool concept as we might have some real time information like current location available for the user.</p></li><li><p>Internal documents that a user might be interested in.</p></li><li><p>Data available on the web.</p></li><li><p>&#8230;</p></li></ol></li><li><p>If there is no need for additional data, we try to compose the answer (or multiple answers or a set of actions) straight via an LLM.</p></li><li><p>The answer gets analyzed, summarized and evaluated for correctness and relevance:</p><ol><li><p>If the Agent decides that the answer is good enough, it gets returned to the user.</p></li><li><p>If the Agent decides that the answer needs improvement, we try to rewrite the user query and repeat the generation loop.</p></li></ol></li></ol><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p></p><h4>How does the architecture change if we introduce MCP into the picture?</h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!2OF_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F558b1003-fe05-4fdd-9272-b0cf3533021c_1574x1609.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!2OF_!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F558b1003-fe05-4fdd-9272-b0cf3533021c_1574x1609.png 424w, 
https://substackcdn.com/image/fetch/$s_!2OF_!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F558b1003-fe05-4fdd-9272-b0cf3533021c_1574x1609.png 848w, https://substackcdn.com/image/fetch/$s_!2OF_!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F558b1003-fe05-4fdd-9272-b0cf3533021c_1574x1609.png 1272w, https://substackcdn.com/image/fetch/$s_!2OF_!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F558b1003-fe05-4fdd-9272-b0cf3533021c_1574x1609.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!2OF_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F558b1003-fe05-4fdd-9272-b0cf3533021c_1574x1609.png" width="624" height="637.7142857142857" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/558b1003-fe05-4fdd-9272-b0cf3533021c_1574x1609.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5d809f2b-4486-4aa5-a48f-c8d4dde86c57_1574x1609.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1488,&quot;width&quot;:1456,&quot;resizeWidth&quot;:624,&quot;bytes&quot;:238906,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/159065609?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d809f2b-4486-4aa5-a48f-c8d4dde86c57_1574x1609.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!2OF_!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F558b1003-fe05-4fdd-9272-b0cf3533021c_1574x1609.png 424w, https://substackcdn.com/image/fetch/$s_!2OF_!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F558b1003-fe05-4fdd-9272-b0cf3533021c_1574x1609.png 848w, https://substackcdn.com/image/fetch/$s_!2OF_!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F558b1003-fe05-4fdd-9272-b0cf3533021c_1574x1609.png 1272w, https://substackcdn.com/image/fetch/$s_!2OF_!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F558b1003-fe05-4fdd-9272-b0cf3533021c_1574x1609.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Agentic RAG with MCP Server</figcaption></figure></div><p>We can place MCP servers in front of all data sources relevant to the retrieval procedure. The MCP server handles retrieval logic through Tools, since the LLM will be &#8220;choosing&#8221; which data sources are relevant for the system. Here are some benefits of this approach:</p><ul><li><p>We decouple retrieval logic from the topology of the Agentic system.</p></li><li><p>We can then evolve the retrieval component separately:</p><ul><li><p>Introduce additional tools.</p></li><li><p>Introduce additional data sources.</p></li><li><p>Version, evolve and roll back existing tools and data sources.</p></li></ul></li><li><p>We can manage security and access to the data via the MCP server.</p></li><li><p>A separate team can work independently on the data it is responsible for.</p></li></ul><p></p><h4>Evolving the architecture in larger enterprises.</h4><p>As enterprises grow in size, different teams start owning specific data assets. 
E.g.:</p><ul><li><p>CRM Data.</p></li><li><p>Financial Data.</p></li><li><p>Real time Web clickstream data.</p></li><li><p>&#8230;.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!8bCW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef24f8ad-5c6e-4b3e-8266-5f73342a5cc5_1574x1609.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!8bCW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef24f8ad-5c6e-4b3e-8266-5f73342a5cc5_1574x1609.png 424w, https://substackcdn.com/image/fetch/$s_!8bCW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef24f8ad-5c6e-4b3e-8266-5f73342a5cc5_1574x1609.png 848w, https://substackcdn.com/image/fetch/$s_!8bCW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef24f8ad-5c6e-4b3e-8266-5f73342a5cc5_1574x1609.png 1272w, https://substackcdn.com/image/fetch/$s_!8bCW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef24f8ad-5c6e-4b3e-8266-5f73342a5cc5_1574x1609.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!8bCW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef24f8ad-5c6e-4b3e-8266-5f73342a5cc5_1574x1609.png" width="620" height="633.6263736263736" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ef24f8ad-5c6e-4b3e-8266-5f73342a5cc5_1574x1609.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/656d5607-f09c-467c-b61b-569d70a106b4_1574x1609.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1488,&quot;width&quot;:1456,&quot;resizeWidth&quot;:620,&quot;bytes&quot;:247764,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/159065609?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F656d5607-f09c-467c-b61b-569d70a106b4_1574x1609.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!8bCW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef24f8ad-5c6e-4b3e-8266-5f73342a5cc5_1574x1609.png 424w, https://substackcdn.com/image/fetch/$s_!8bCW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef24f8ad-5c6e-4b3e-8266-5f73342a5cc5_1574x1609.png 848w, https://substackcdn.com/image/fetch/$s_!8bCW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef24f8ad-5c6e-4b3e-8266-5f73342a5cc5_1574x1609.png 1272w, https://substackcdn.com/image/fetch/$s_!8bCW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef24f8ad-5c6e-4b3e-8266-5f73342a5cc5_1574x1609.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Agentic RAG with multiple MCP Servers</figcaption></figure></div><p>It is hard to manage these disparate data sources effectively, and this is where MCP brings a lot of value.</p><ul><li><p>Each data domain can manage its own MCP Servers.</p></li><li><p>All the MCP servers will use the same protocol.</p></li><li><p>Because of the above, the integration effort for LLM-based applications will be significantly reduced.</p></li><li><p>AI Engineers can continue to focus on the topology of the Agent.</p></li></ul><p><strong>[IMPORTANT]: </strong>This is one of the biggest advantages of using MCP in my opinion. It allows the decoupling of systems in large projects. 
Different teams can work on their own domains without disturbing development of the main Agentic topology.</p><p></p><h3>The future roadmap of MCP.</h3><p>The public roadmap for the next 6 months strongly suggests a strengthening of the Cloud Native aspect of the project, with improvements in:</p><ul><li><p>Authentication and authorisation.</p></li><li><p>Service Discovery.</p></li></ul><p>It also expands support for the future of Agentic Systems, with a focus on:</p><ul><li><p>Hierarchical Agent Systems.</p></li><li><p>Interactive Workflows to improve human-in-the-loop interactions.</p></li><li><p>Streaming of Results.</p></li></ul><p></p><h3>That&#8217;s it for today, let&#8217;s sum up.</h3><p>While MCP is still relatively rough around the edges, the roadmap does look promising. The fact that the project is backed by Anthropic also bodes well for its adoption.</p><p>You should keep your eye on the project and potentially start adopting it!</p><p>Expect more hands-on content utilising MCP in future episodes.</p><div><hr></div><p>Once again, thank you to the sponsors of this Newsletter - <a href="https://fnf.dev/3Fp0qx6">Bufstream</a>.</p><div><hr></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/p/sponsorships&quot;,&quot;text&quot;:&quot;Partner with SwirlAI&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.newsletter.swirlai.com/p/sponsorships"><span>Partner with SwirlAI</span></a></p><p 
class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/p/what-is-ai-engineering?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&amp;token=eyJ1c2VyX2lkIjoxNDEyMjI1OSwicG9zdF9pZCI6MTUxNzczMTkwLCJpYXQiOjE3MzQ3NDY1MTEsImV4cCI6MTczNzMzODUxMSwiaXNzIjoicHViLTExNDQxNzEiLCJzdWIiOiJwb3N0LXJlYWN0aW9uIn0.ozYMMhYkRStOvKkGl3QkPp9i4bSdxZ1NzUbWlfhxm98&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.newsletter.swirlai.com/p/what-is-ai-engineering?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&amp;token=eyJ1c2VyX2lkIjoxNDEyMjI1OSwicG9zdF9pZCI6MTUxNzczMTkwLCJpYXQiOjE3MzQ3NDY1MTEsImV4cCI6MTczNzMzODUxMSwiaXNzIjoicHViLTExNDQxNzEiLCJzdWIiOiJwb3N0LXJlYWN0aW9uIn0.ozYMMhYkRStOvKkGl3QkPp9i4bSdxZ1NzUbWlfhxm98"><span>Share</span></a></p>]]></content:encoded></item><item><title><![CDATA[Building Deep Research Agent from scratch]]></title><description><![CDATA[Let's build a Deep Research Agent to help you in your day to day work powered by DeepSeek R1 from scratch.]]></description><link>https://www.newsletter.swirlai.com/p/building-deep-research-agent-from</link><guid isPermaLink="false">https://www.newsletter.swirlai.com/p/building-deep-research-agent-from</guid><dc:creator><![CDATA[Aurimas Griciūnas]]></dc:creator><pubDate>Tue, 11 Mar 2025 08:03:16 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/d70c1604-8ab3-4476-8ef6-f8f8f5a4eadd_3040x2289.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#128075; <em>I am <a href="https://www.linkedin.com/in/aurimas-griciunas">Aurimas</a>. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. 
My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">SwirlAI Newsletter is a reader-supported publication. To receive new posts and support my work, consider becoming a subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><p>The big topic of last month was Deep Research Agents, which every large player in the LLM industry is building and trying to monetise. The big catalyst for this was the emergence of the DeepSeek R1 reasoning model and its open source nature.</p><p>In this episode of the Newsletter we are going to build such a Deep Research Agentic system from scratch. It will allow us to strengthen our fundamental knowledge of how these upcoming agentic systems actually function under the hood.</p><p>We are also going to run our system on the previously mentioned DeepSeek R1.</p><div><hr></div><p>This brings us to today&#8217;s sponsor - SambaNova. The platform will allow us to run and test our system for free. 
Registering on the platform gives you $5 of credits with no credit card details needed, which will be enough to run the project.</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!IcK9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F028517d4-05d7-44d1-9d01-41c0d2c218e1_3972x660.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!IcK9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F028517d4-05d7-44d1-9d01-41c0d2c218e1_3972x660.png 424w, https://substackcdn.com/image/fetch/$s_!IcK9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F028517d4-05d7-44d1-9d01-41c0d2c218e1_3972x660.png 848w, https://substackcdn.com/image/fetch/$s_!IcK9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F028517d4-05d7-44d1-9d01-41c0d2c218e1_3972x660.png 1272w, https://substackcdn.com/image/fetch/$s_!IcK9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F028517d4-05d7-44d1-9d01-41c0d2c218e1_3972x660.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!IcK9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F028517d4-05d7-44d1-9d01-41c0d2c218e1_3972x660.png" width="590" height="98.06318681318682" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/028517d4-05d7-44d1-9d01-41c0d2c218e1_3972x660.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:242,&quot;width&quot;:1456,&quot;resizeWidth&quot;:590,&quot;bytes&quot;:43361,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/157875435?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F028517d4-05d7-44d1-9d01-41c0d2c218e1_3972x660.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!IcK9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F028517d4-05d7-44d1-9d01-41c0d2c218e1_3972x660.png 424w, https://substackcdn.com/image/fetch/$s_!IcK9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F028517d4-05d7-44d1-9d01-41c0d2c218e1_3972x660.png 848w, https://substackcdn.com/image/fetch/$s_!IcK9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F028517d4-05d7-44d1-9d01-41c0d2c218e1_3972x660.png 1272w, https://substackcdn.com/image/fetch/$s_!IcK9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F028517d4-05d7-44d1-9d01-41c0d2c218e1_3972x660.png 1456w" sizes="100vw" fetchpriority="high"></picture><div></div></div></a></figure></div><p>SambaNova provides a selection of models in Llama, Qwen and DeepSeek families through their APIs for applications and a Playground for exploratory purposes. 
When it comes to DeepSeek models, they provide both distilled versions and the non-distilled 671 Billion parameter version. In the examples we will be running the 671B version but you can always switch to other versions.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://fnf.dev/4aVUqro&quot;,&quot;text&quot;:&quot;Check it out&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://fnf.dev/4aVUqro"><span>Check it out</span></a></p><p>Thank you for helping keep SwirlAI Newsletters free for everyone!</p><div><hr></div><h3>What are Deep Research Agents?</h3><p>To put it simply, these are systems that are capable of running in-depth research on a predefined topic. Usually, this would include at least the following steps:</p><ul><li><p>Planning the research - this could mean creating an outline of the research report that would eventually become the output of the system.</p></li><li><p>Splitting the above into manageable steps.</p></li><li><p>Performing deep research on sections of the report. 
This means reasoning about the data needed to provide comprehensive analysis and utilising web search tools to support the analysis.</p></li><li><p>Reflecting on data generated in different steps of the research and improving the results.</p></li><li><p>Summarising the retrieved data and coming back with the final Research Report.</p></li></ul><p></p><p>Today we will implement all of the above without using any LLM Orchestration framework.</p><p>You can find the code (Notebook included for interactive learning) in my &#8220;AI Engineers Handbook&#8221; GitHub repository:</p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://github.com/swirl-ai/ai-angineers-handbook/tree/main/building_agents_from_scratch/deep_research_agent&quot;,&quot;text&quot;:&quot;GitHub Repository&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://github.com/swirl-ai/ai-angineers-handbook/tree/main/building_agents_from_scratch/deep_research_agent"><span>GitHub Repository</span></a></p><p></p><p>Follow and Star the repository if you like the content!</p><p>For a step-by-step explanation, continue reading.</p><p></p><h3>The System Topology.</h3><p>The picture below represents what we are going to build. Here are the steps that the resulting system will perform:</p><ol><li><p>A user will provide a query or topic to be researched.</p></li><li><p>An LLM will create an outline of the final report that it will be aiming for. It will be instructed to produce no more than a certain number of paragraphs.</p></li><li><p>Each paragraph description will be fed into a research process separately to produce a comprehensive set of information to be used in report construction. 
A detailed description of the research process follows in the next section.</p></li><li><p>All of the information will be fed into a summarisation step that will construct the final report, including a conclusion.</p></li><li><p>The report will then be delivered to the user in Markdown form.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!NNAB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e3d9c09-537d-4b0f-a198-5745fe2194b5_1932x1337.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!NNAB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e3d9c09-537d-4b0f-a198-5745fe2194b5_1932x1337.png 424w, https://substackcdn.com/image/fetch/$s_!NNAB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e3d9c09-537d-4b0f-a198-5745fe2194b5_1932x1337.png 848w, https://substackcdn.com/image/fetch/$s_!NNAB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e3d9c09-537d-4b0f-a198-5745fe2194b5_1932x1337.png 1272w, https://substackcdn.com/image/fetch/$s_!NNAB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e3d9c09-537d-4b0f-a198-5745fe2194b5_1932x1337.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!NNAB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e3d9c09-537d-4b0f-a198-5745fe2194b5_1932x1337.png" width="1456" height="1008" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6e3d9c09-537d-4b0f-a198-5745fe2194b5_1932x1337.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/05ff06c5-511d-42cb-a84b-8f315267c81a_1932x1337.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1008,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:205276,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/157875435?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F05ff06c5-511d-42cb-a84b-8f315267c81a_1932x1337.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!NNAB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e3d9c09-537d-4b0f-a198-5745fe2194b5_1932x1337.png 424w, https://substackcdn.com/image/fetch/$s_!NNAB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e3d9c09-537d-4b0f-a198-5745fe2194b5_1932x1337.png 848w, https://substackcdn.com/image/fetch/$s_!NNAB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e3d9c09-537d-4b0f-a198-5745fe2194b5_1932x1337.png 1272w, https://substackcdn.com/image/fetch/$s_!NNAB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F6e3d9c09-537d-4b0f-a198-5745fe2194b5_1932x1337.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Deep Research Agent Topology</figcaption></figure></div><p></p><h4>The Research Step.</h4><p>Let&#8217;s zoom into the research step defined in the previous paragraph:</p><ol><li><p>Once we have the outline of each paragraph, it will be passed to an LLM to construct web search queries that best enrich the information needed.</p></li><li><p>The LLM will output the search query and the reasoning behind it.</p></li><li><p>We will execute a web search for the query and retrieve the top relevant results.</p></li><li><p>The results will be passed to the Reflection step, where an LLM will reason about any missed nuances and try to come up with a search query that would enrich the initial 
results.</p></li><li><p>This process will be repeated n times in an attempt to get the best set of information possible.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fIgB!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd105c4b-1aa7-4281-bcb2-b2bfde71f04d_1673x1028.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fIgB!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd105c4b-1aa7-4281-bcb2-b2bfde71f04d_1673x1028.png 424w, https://substackcdn.com/image/fetch/$s_!fIgB!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd105c4b-1aa7-4281-bcb2-b2bfde71f04d_1673x1028.png 848w, https://substackcdn.com/image/fetch/$s_!fIgB!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd105c4b-1aa7-4281-bcb2-b2bfde71f04d_1673x1028.png 1272w, https://substackcdn.com/image/fetch/$s_!fIgB!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd105c4b-1aa7-4281-bcb2-b2bfde71f04d_1673x1028.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fIgB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd105c4b-1aa7-4281-bcb2-b2bfde71f04d_1673x1028.png" width="1456" height="895" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cd105c4b-1aa7-4281-bcb2-b2bfde71f04d_1673x1028.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9bae0484-e22f-47d7-91a0-3b3c827822c3_1673x1028.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:895,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:152083,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/157875435?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9bae0484-e22f-47d7-91a0-3b3c827822c3_1673x1028.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fIgB!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd105c4b-1aa7-4281-bcb2-b2bfde71f04d_1673x1028.png 424w, https://substackcdn.com/image/fetch/$s_!fIgB!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd105c4b-1aa7-4281-bcb2-b2bfde71f04d_1673x1028.png 848w, https://substackcdn.com/image/fetch/$s_!fIgB!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd105c4b-1aa7-4281-bcb2-b2bfde71f04d_1673x1028.png 1272w, https://substackcdn.com/image/fetch/$s_!fIgB!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcd105c4b-1aa7-4281-bcb2-b2bfde71f04d_1673x1028.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Research Step Topology</figcaption></figure></div><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p></p><h3>Implementing the Agent.</h3><p></p><p>Before we go into the implementation stage let&#8217;s do some technical hygiene. 
If you haven&#8217;t done so yet, go to the SambaNova Cloud console and get your API key. We will use it to explore some important output characteristics of the DeepSeek R1 model family.</p><p>You can register <a href="https://fnf.dev/4aVUqro">here</a>.</p><p>Go to the &#8220;APIs&#8221; tab; you will be prompted to log in. Don&#8217;t worry, adding your credit card details is optional: we will be able to run this project without doing that.</p><p>To run queries against the API we will use the OpenAI client. If you don&#8217;t have it yet, simply run:</p><pre><code>pip install openai</code></pre><p>For the project we will use the non-distilled DeepSeek-R1 version with 671B parameters. </p><p><em>If you can&#8217;t access it yet at the time of reading the newsletter, join the waitlist and switch the API endpoint and model version to a smaller distilled version. </em></p><p>Make sure that your SAMBANOVA_API_KEY is exported as an environment variable and run the following in your console or notebook:</p><pre><code>import os
import openai

client = openai.OpenAI(
    api_key=os.environ.get("SAMBANOVA_API_KEY"),
    base_url="https://preview.snova.ai/v1",
)

response = client.chat.completions.create(
    model="DeepSeek-R1",
    messages=[{"role":"system","content":"You are a helpful assistant"},
              {"role":"user","content":"Tell me something interesting about human species"}],
    temperature=1
)

print(response.choices[0].message.content)</code></pre><p>You should see something similar to:</p><pre><code>&lt;think&gt;
Okay, so I'm trying to ... &lt;REDACTED&gt;
&lt;/think&gt;

The human species is distinguished by the remarkable cognitive abilities of the brain, which underpin a array of unique traits. Our brain's advanced structure and function enable complex thought, language, and social organization. These capabilities have driven innovation, art, and the creation of intricate societies, setting humans apart in their ability to adapt, innovate, and create beyond any other species. This cognitive prowess is the cornerstone of human achievement and our profound impact on the world.</code></pre><p>Reasoning tokens will always be included in the answer. While it is interesting to see the thinking process, what we will need in our systems is the answers only. This is where we can create a hygiene function to remove anything between the <em>&lt;think&gt;</em> tags.</p><pre><code>def remove_reasoning_from_output(output):

    return output.split("&lt;/think&gt;")[-1].strip()</code></pre><p>Simple yet useful.</p><p>Great! We have now set up the SambaNova account and understood the output structure of DeepSeek R1 family models, let&#8217;s go implement the Deep Research Agent.</p><p></p><h4>Part 1: Defining the State.</h4><p>First, we will need to define the state of the entire system that will be continuously evolved while the Agent is running in the environment and used by different parts of the system selectively.</p><p>Let&#8217;s relate the state to the Stages of the Agentic system:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!phCt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c697471-715b-4b6a-8b70-0b7db1c13b60_1980x1412.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!phCt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c697471-715b-4b6a-8b70-0b7db1c13b60_1980x1412.png 424w, https://substackcdn.com/image/fetch/$s_!phCt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c697471-715b-4b6a-8b70-0b7db1c13b60_1980x1412.png 848w, https://substackcdn.com/image/fetch/$s_!phCt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c697471-715b-4b6a-8b70-0b7db1c13b60_1980x1412.png 1272w, https://substackcdn.com/image/fetch/$s_!phCt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c697471-715b-4b6a-8b70-0b7db1c13b60_1980x1412.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!phCt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c697471-715b-4b6a-8b70-0b7db1c13b60_1980x1412.png" width="1456" height="1038" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0c697471-715b-4b6a-8b70-0b7db1c13b60_1980x1412.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/abee361c-ec22-4634-97dc-6eed4a8e0815_1980x1412.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1038,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:264645,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/157875435?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fabee361c-ec22-4634-97dc-6eed4a8e0815_1980x1412.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!phCt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c697471-715b-4b6a-8b70-0b7db1c13b60_1980x1412.png 424w, https://substackcdn.com/image/fetch/$s_!phCt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c697471-715b-4b6a-8b70-0b7db1c13b60_1980x1412.png 848w, https://substackcdn.com/image/fetch/$s_!phCt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c697471-715b-4b6a-8b70-0b7db1c13b60_1980x1412.png 1272w, 
https://substackcdn.com/image/fetch/$s_!phCt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F0c697471-715b-4b6a-8b70-0b7db1c13b60_1980x1412.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Topology State</figcaption></figure></div><ul><li><p><em>Stage 1</em> will be the creation of the outline where report structure will be planned and its state evolved. We will start with an empty state but evolve it to something similar (reasoning is described in stage 2):</p><pre><code>{
    "report_title": "Report Title",
    "paragraphs": [
        {
            "title": "Paragraph Title",
            "content": "Paragraph Content",
            "research": &lt;...&gt;
        },
        {
            "title": "Paragraph Title",
            "content": "Paragraph Content",
            "research": &lt;...&gt;
        }
    ]
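}</code></pre><p>The state can be implemented in a clean way using Python dataclasses. The above would look like:</p><pre><code>from dataclasses import dataclass, field
from typing import List

# Research is defined in Stage 2 below; in a single module it must be
# declared before Paragraph, because default_factory=Research is evaluated
# when the Paragraph class is created.
@dataclass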
class Paragraph:
    title: str = ""
    content: str = ""
    research: Research = field(default_factory=Research)

@dataclass
class State:
    report_title: str = ""
    paragraphs: List[Paragraph] = field(default_factory=list)</code></pre></li><li><p><em>Stage 2 </em>is where we will iterate on the state of each paragraph, updating its <em>&#8220;research&#8221; </em>field. We will use the following structure of the research state per paragraph:</p><pre><code>{
"search_history": [{"url": "some url", "content": "some content"}],
"latest_summary": "summary of the combined search history",
"reflection_iteration": 1
}</code></pre><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Uwxe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2194f5-e847-4f07-9bf4-cf5f5c3baa0f_1673x1028.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Uwxe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2194f5-e847-4f07-9bf4-cf5f5c3baa0f_1673x1028.png 424w, https://substackcdn.com/image/fetch/$s_!Uwxe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2194f5-e847-4f07-9bf4-cf5f5c3baa0f_1673x1028.png 848w, https://substackcdn.com/image/fetch/$s_!Uwxe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2194f5-e847-4f07-9bf4-cf5f5c3baa0f_1673x1028.png 1272w, https://substackcdn.com/image/fetch/$s_!Uwxe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2194f5-e847-4f07-9bf4-cf5f5c3baa0f_1673x1028.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Uwxe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2194f5-e847-4f07-9bf4-cf5f5c3baa0f_1673x1028.png" width="1456" height="895" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ae2194f5-e847-4f07-9bf4-cf5f5c3baa0f_1673x1028.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3b874157-04a3-44eb-9550-cd407cee7482_1673x1028.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:895,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:146929,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/157875435?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3b874157-04a3-44eb-9550-cd407cee7482_1673x1028.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Uwxe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2194f5-e847-4f07-9bf4-cf5f5c3baa0f_1673x1028.png 424w, https://substackcdn.com/image/fetch/$s_!Uwxe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2194f5-e847-4f07-9bf4-cf5f5c3baa0f_1673x1028.png 848w, https://substackcdn.com/image/fetch/$s_!Uwxe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2194f5-e847-4f07-9bf4-cf5f5c3baa0f_1673x1028.png 1272w, https://substackcdn.com/image/fetch/$s_!Uwxe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae2194f5-e847-4f07-9bf4-cf5f5c3baa0f_1673x1028.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Research Step</figcaption></figure></div><p><em>search_history</em> - we will store all of the searches we perform in a list. We keep both the URL and the content so that we can deduplicate search results and refer to the links later when forming the final report.</p><p><em>latest_summary</em> - the summarised version of the paragraph given all of the search results. 
It will be used in the reflection step to decide whether more searching is needed, and it will be passed to the next step of summarisation and report creation.</p><p><em>reflection_iteration</em> - tracks the current reflection iteration so that we can force a stop once the limit is reached.</p><p></p><p>Again, we can implement the research state via dataclasses:</p><pre><code>from dataclasses import dataclass, field
from typing import List

@dataclass
class Search:
    url: str = ""
    content: str = ""
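
# A minimal, hypothetical sketch (not taken from the original repository) of
# the de-duplication mentioned above: keep the first result seen for each URL.
# It only assumes that every item exposes a `url` attribute, as Search does.
def deduplicate_searches(searches):
    seen_urls = set()
    unique = []
    for search in searches:
        if search.url in seen_urls:
            # This URL was already stored by an earlier search; skip it.
            continue
        seen_urls.add(search.url)
        unique.append(search)
    return unique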

@dataclass
class Research:
    search_history: List[Search] = field(default_factory=list)
    latest_summary: str = ""
    reflection_iteration: int = 0</code></pre></li></ul><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><h4>Part 2: Creating the report outline.</h4><p>Different versions of the models will vary in how consistent their answers are. I experimented with <em>DeepSeek-R1 </em>a bunch and the following prompt seemed to produce consistently well-formatted outputs:</p><pre><code>import json

output_schema_report_structure = {
        "type": "array",
        "items": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "content": {"type": "string"}
            }
        }
    }
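
import json  # repeated here so that this sketch runs on its own

# A hedged sketch, not part of the original code: a hypothetical helper that
# checks a model reply against the schema above. It assumes the reasoning
# block has already been removed and that the model may wrap the JSON in a
# Markdown code fence. It is stricter than the schema itself, which does not
# mark "title" and "content" as required.
def parse_report_structure(raw_output):
    text = raw_output.strip()
    if text.startswith("```"):
        # Drop the opening fence line (for example "```json").
        text = text.split("\n", 1)[1] if "\n" in text else ""
        # Drop the closing fence if it is present.
        if text.rstrip().endswith("```"):
            text = text.rstrip()[:-3]
    paragraphs = json.loads(text)
    if not isinstance(paragraphs, list):
        raise ValueError("expected a JSON array of paragraphs")
    for item in paragraphs:
        if not isinstance(item, dict):
            raise ValueError("each paragraph must be a JSON object")
        for key in ("title", "content"):
            if not isinstance(item.get(key), str):
                raise ValueError(f"paragraph is missing a string '{key}'")
    return paragraphs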

SYSTEM_PROMPT_REPORT_STRUCTURE = f"""
You are a Deep Research assistant. Given a query, plan a structure for a report and the paragraphs to be included.
Make sure that the ordering of paragraphs makes sense.
Once the outline is created, you will be given tools to search the web and reflect for each of the sections separately.
Format the output in json with the following json schema definition:

&lt;OUTPUT JSON SCHEMA&gt;
{json.dumps(output_schema_report_structure, indent=2)}
&lt;/OUTPUT JSON SCHEMA&gt;

Title and content properties will be used for deeper research.
Make sure that the output is a json object with an output json schema defined above.
Only return the json object, no explanation or additional text.
"""</code></pre><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uzoh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb01ea8db-e2bb-436d-b602-e6edacda0da8_1362x1307.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uzoh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb01ea8db-e2bb-436d-b602-e6edacda0da8_1362x1307.png 424w, https://substackcdn.com/image/fetch/$s_!uzoh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb01ea8db-e2bb-436d-b602-e6edacda0da8_1362x1307.png 848w, https://substackcdn.com/image/fetch/$s_!uzoh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb01ea8db-e2bb-436d-b602-e6edacda0da8_1362x1307.png 1272w, https://substackcdn.com/image/fetch/$s_!uzoh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb01ea8db-e2bb-436d-b602-e6edacda0da8_1362x1307.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uzoh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb01ea8db-e2bb-436d-b602-e6edacda0da8_1362x1307.png" width="516" height="495.16299559471366" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b01ea8db-e2bb-436d-b602-e6edacda0da8_1362x1307.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a757086d-4f68-4111-9b09-707f13f3210e_1362x1307.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1307,&quot;width&quot;:1362,&quot;resizeWidth&quot;:516,&quot;bytes&quot;:137862,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/157875435?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa757086d-4f68-4111-9b09-707f13f3210e_1362x1307.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uzoh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb01ea8db-e2bb-436d-b602-e6edacda0da8_1362x1307.png 424w, https://substackcdn.com/image/fetch/$s_!uzoh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb01ea8db-e2bb-436d-b602-e6edacda0da8_1362x1307.png 848w, https://substackcdn.com/image/fetch/$s_!uzoh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb01ea8db-e2bb-436d-b602-e6edacda0da8_1362x1307.png 1272w, https://substackcdn.com/image/fetch/$s_!uzoh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb01ea8db-e2bb-436d-b602-e6edacda0da8_1362x1307.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Paragraph Structure State</figcaption></figure></div><p>Let&#8217;s run a sample query with the above system prompt:</p><pre><code>response = client.chat.completions.create(
    model="DeepSeek-R1",
    messages=[{"role":"system","content":SYSTEM_PROMPT_REPORT_STRUCTURE},
              {"role":"user","content":"Tell me something interesting about human species"}],
    temperature=1
)

print(response.choices[0].message.content)</code></pre><p>You will get something similar to:</p><pre><code>```json
[
  {
    "title": "Introduction to Human Adaptability",
    "content": "Humans possess a unique capacity for adaptability, which has been crucial in their survival and dominance across various environments. This introduction sets the stage for exploring the different facets of human adaptability."
  },
  ...
  &lt;REDACTED&gt;
  ...
  {
    "title": "Conclusion: The Role of Adaptability in Human Survival",
    "content": "Adaptability has been a cornerstone of human survival and evolution, enabling us to face challenges and explore new frontiers, offering insights into future potential."
  }
]
```</code></pre><p>The json fence markers surrounding the output are not helpful, as we need to parse the output into a Python object. Here is a very simple function that removes the first and last lines of the output:</p><pre><code>def clean_json_tags(text):

    return text.replace("```json\n", "").replace("\n```", "")</code></pre><p>Here is how we get the properly cleaned output:</p><pre><code>json.loads(clean_json_tags(remove_reasoning_from_output(response.choices[0].message.content)))</code></pre><p>We can now use the above as input to our global state directly.</p><pre><code>STATE = State()
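# Optional sketch, not part of the original walkthrough: a more defensive alternative
# to clean_json_tags. The helper name extract_json_payload is hypothetical. It pulls the
# payload out of a fenced block with a regex and falls back to the raw text when the
# model returns no fence at all.
import json
import re

def extract_json_payload(text):
    fence = "`" * 3  # three backticks, built this way to keep the snippet fence-free
    pattern = fence + r"(?:json)?\s*(.*?)\s*" + fence
    match = re.search(pattern, text, re.DOTALL)
    payload = match.group(1) if match else text.strip()
    return json.loads(payload)
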

report_structure = json.loads(clean_json_tags(remove_reasoning_from_output(response.choices[0].message.content)))

for paragraph in report_structure:
    STATE.paragraphs.append(Paragraph(title=paragraph["title"], content=paragraph["content"]))</code></pre><p></p><h4>Part 3: The Web Search tool.</h4><p>We will be using Tavily to perform the web search. You can get your API key <a href="https://app.tavily.com/">here</a>.</p><p>The tool for it is very simple:</p><pre><code>def tavily_search(query, include_raw_content=True, max_results=5):

    tavily_client = TavilyClient(api_key=os.getenv("TAVILY_API_KEY"))

    return tavily_client.search(query,
                                include_raw_content=include_raw_content,
                                max_results=max_results)</code></pre><p>Each function call returns up to <em>max_results</em> web search results, and each result contains:</p><ul><li><p>Title of the search result.</p></li><li><p>URL of the search result.</p></li><li><p>Summary of the content.</p></li><li><p>Full content of the page if available. We want this for best results.</p></li></ul><p>Once we get the results from the Web Search tool, we can add them to the global state directly, without calling any LLM, but we need to make sure that we update the correct element in the list of paragraphs.</p><p>Basically, given the structure we defined:</p><pre><code>{
    "report_title": "Report Title",
    "paragraphs": [
        {
            "title": "Paragraph Title",
            "content": "Paragraph Content",
            "research": &lt;...&gt;
        },
        {
            "title": "Paragraph Title",
            "content": "Paragraph Content",
            "research": &lt;...&gt;
        }
    ]
}</code></pre><p>And given that we are currently researching the <em>i</em>th paragraph, we need to update the field <em>.paragraphs[i].research</em>.</p><p>Recall the structure of the research field:</p><pre><code>{
"search_history": [{"url": "some url", "content": "some content"}],
"latest_summary": "summary of the combined search history",
"reflection_iteration": 1
}</code></pre><p>Here is a handy function that updates the state correctly, given the Tavily search results, the index of the paragraph and the state object.</p><pre><code>def update_state_with_search_results(search_results, idx_paragraph, state):
    
    for search_result in search_results["results"]:
        search = Search(url=search_result["url"], content=search_result["raw_content"])
        state.paragraphs[idx_paragraph].research.search_history.append(search)

    return state</code></pre><p>Each call appends to the search history. The <em>latest_summary</em> and <em>reflection_iteration</em> fields need some work by the LLM and will be discussed in <strong>Part 5: The first summary</strong> and <strong>Part 6: Reflecting</strong>.</p><p></p><h4>Part 4: Planning the Search.</h4><p>To plan the first search, I found the following prompt to produce consistently good results:</p><pre><code>input_schema_first_search = {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "content": {"type": "string"}
            }
        }

output_schema_first_search = {
            "type": "object",
            "properties": {
                "search_query": {"type": "string"},
                "reasoning": {"type": "string"}
            }
        }

SYSTEM_PROMPT_FIRST_SEARCH = f"""
You are a Deep Research assistant. You will be given a paragraph in a report, its title and expected content, in the following json schema definition:

&lt;INPUT JSON SCHEMA&gt;
{json.dumps(input_schema_first_search, indent=2)}
&lt;/INPUT JSON SCHEMA&gt;

You can use a web search tool that takes a 'search_query' as parameter.
Your job is to reflect on the topic and provide the most optimal web search query to enrich your current knowledge.
Format the output in json with the following json schema definition:

&lt;OUTPUT JSON SCHEMA&gt;
{json.dumps(output_schema_first_search, indent=2)}
&lt;/OUTPUT JSON SCHEMA&gt;

Make sure that the output is a json object with an output json schema defined above.
Only return the json object, no explanation or additional text.
"""</code></pre><p>We ask for reasoning in the output schema just to force more thought around the query. While it is most likely overkill for a reasoning model, it might be a good idea with a regular LLM. While we are using DeepSeek R1 for this, we don&#8217;t necessarily need to. Reasoning models are especially handy in the first step of the Deep Research Agent, where planning of the report structure is required.</p><p>Given that we now have a list of planned paragraphs with their titles and content descriptions, we can feed the output of Part 2 directly to the prompt. This is what it would look like:</p><pre><code>response = client.chat.completions.create(
    model="DeepSeek-R1",
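    messages=[{"role":"system","content":SYSTEM_PROMPT_FIRST_SEARCH},
              # Pass only the fields named in the input schema above.
              {"role":"user","content":json.dumps({"title": STATE.paragraphs[0].title,
                                                   "content": STATE.paragraphs[0].content})}],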
    temperature=1
)
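# Optional sketch, not part of the original walkthrough: pull the query out of the
# model output so it can be passed to tavily_search without copy-pasting.
# parse_search_query is a hypothetical helper; it assumes any reasoning block and
# code fence have already been stripped from raw_output.
import json

def parse_search_query(raw_output):
    plan = json.loads(raw_output)
    query = plan.get("search_query", "") if isinstance(plan, dict) else ""
    if not isinstance(query, str) or not query.strip():
        raise ValueError("model output did not contain a usable search_query")
    return query.strip()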

print(response.choices[0].message.content)</code></pre><p><em>STATE.paragraphs[0]</em> points to the first paragraph state, where the <em>research</em> field is still empty.</p><p>I got the following for my first search plan:</p><pre><code>{"search_query": "Homo sapiens characteristics basic biological traits cognitive abilities behavioral traits"}</code></pre><p>I can directly input the query into my search tool:</p><pre><code>tavily_search("Homo sapiens characteristics basic biological traits cognitive abilities behavioral traits")</code></pre><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p></p><h4>Part 5: The first summary.</h4><p>The first summary differs from the upcoming Reflection step: there is nothing to reflect on yet, and this step produces the first draft that later steps will reflect on. The following prompt works relatively well:</p><pre><code>input_schema_first_summary = {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "content": {"type": "string"},
                "search_query": {"type": "string"},
                "search_results": {
                    "type": "array",
                    "items": {"type": "string"}
                }
            }
        }

output_schema_first_summary = {
            "type": "object",
            "properties": {
                "paragraph_latest_state": {"type": "string"}
            }
        }

SYSTEM_PROMPT_FIRST_SUMMARY = f"""
You are a Deep Research assistant. You will be given a search query, search results and the paragraph of a report that you are researching, in the following json schema definition:

&lt;INPUT JSON SCHEMA&gt;
{json.dumps(input_schema_first_summary, indent=2)}
&lt;/INPUT JSON SCHEMA&gt;

Your job is to write the paragraph as a researcher using the search results to align with the paragraph topic and structure it properly to be included in the report.
Format the output in json with the following json schema definition:

&lt;OUTPUT JSON SCHEMA&gt;
{json.dumps(output_schema_first_summary, indent=2)}
&lt;/OUTPUT JSON SCHEMA&gt;

Make sure that the output is a json object with an output json schema defined above.
Only return the json object, no explanation or additional text.
"""</code></pre><p>We now need to provide data to the LLM in the following format:</p><pre><code> {
    "title": "Title",
    "content": "Content",
    "search_query": "Search Query",
    "search_results": []
}</code></pre><p>We can construct the json from data we already have.</p><p>Given the response from:</p><pre><code>search_results = tavily_search("Homo sapiens characteristics basic biological traits cognitive abilities behavioral traits")</code></pre><p>The json would look like:</p><pre><code><code>input = {
    "title": "Introduction to Human Adaptability",
    "content": "Humans possess a unique capacity for adaptability, which has been crucial in their survival and dominance across various environments. This introduction sets the stage for exploring the different facets of human adaptability.",
    "search_query": "Homo sapiens characteristics basic biological traits cognitive abilities behavioral traits",
    "search_results": [result["raw_content"][0:20000] for result in search_results["results"] if result["raw_content"]]
}</code></code></pre><p>We can then run:</p><pre><code>response = client.chat.completions.create(
    model="DeepSeek-R1",
    messages=[{"role":"system","content": SYSTEM_PROMPT_FIRST_SUMMARY},
              {"role":"user","content":json.dumps(input)}],
    temperature=1
)

print(remove_reasoning_from_output(response.choices[0].message.content))</code></pre><p>You will get something similar to:</p><pre><code>{
  "paragraph_latest_state": "Homo sapiens, the species to which modern humans belong, represents a unique and fascinating chapter in the evolutionary narrative of life on Earth. As the only living species within the Homo genus, Homo sapiens are distinguished by a combination of biological, cognitive, and behavioral traits that set us apart from other primates and extinct human relatives. Our biological characteristics include a large and structurally advanced brain, with a neocortex that has expanded significantly compared to our evolutionary ancestors. This anatomical development has enabled exceptional cognitive abilities, such as complex problem-solving, abstract thought, and the capacity for language and symbolic communication. Behaviorally, Homo sapiens exhibit sophisticated social structures, cultural practices, and technological innovations, which have been critical in shaping our ability to adapt to diverse environments and thrive as a species. These traits collectively underscore the intricate interplay between biology and behavior that defines the human condition."
}</code></pre><p>This is what we will update the <em>STATE.paragraphs[0].research.latest_summary</em> field with. We will also reflect on this continuously updated summary as we move forward in <em>Part 6</em>.</p><p></p><h4>Part 6: Reflecting.</h4><p>We now have the latest state of the report paragraph and will improve it by prompting our LLM to reflect on the text and look for any angles it might have missed while drafting the piece.</p><p>Here is a prompt that does a great job:</p><pre><code>input_schema_reflection = {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "content": {"type": "string"},
                "paragraph_latest_state": {"type": "string"}
            }
        }

output_schema_reflection = {
            "type": "object",
            "properties": {
                "search_query": {"type": "string"},
                "reasoning": {"type": "string"}
            }
        }

SYSTEM_PROMPT_REFLECTION = f"""
You are a Deep Research assistant. You are responsible for constructing comprehensive paragraphs for a research report. You will be provided the paragraph title and planned content summary, as well as the latest state of the paragraph that you have already created, all in the following json schema definition:

&lt;INPUT JSON SCHEMA&gt;
{json.dumps(input_schema_reflection, indent=2)}
&lt;/INPUT JSON SCHEMA&gt;

You can use a web search tool that takes a 'search_query' as a parameter.
Your job is to reflect on the current state of the paragraph text, consider whether you have missed some critical aspect of the topic, and provide the most optimal web search query to enrich the latest state.
Format the output in json with the following json schema definition:

&lt;OUTPUT JSON SCHEMA&gt;
{json.dumps(output_schema_reflection, indent=2)}
&lt;/OUTPUT JSON SCHEMA&gt;

Make sure that the output is a json object with an output json schema defined above.
Only return the json object, no explanation or additional text.
"""</code></pre><p>For the run we are currently implementing the input would look like this:</p><pre><code>input = {"paragraph_latest_state": "Homo sapiens, the species to which modern humans belong, represents a unique and fascinating chapter in the evolutionary narrative of life on Earth. As the only living species within the Homo genus, Homo sapiens are distinguished by a combination of biological, cognitive, and behavioral traits that set us apart from other primates and extinct human relatives. Our biological characteristics include a large and structurally advanced brain, with a neocortex that has expanded significantly compared to our evolutionary ancestors. This anatomical development has enabled exceptional cognitive abilities, such as complex problem-solving, abstract thought, and the capacity for language and symbolic communication. Behaviorally, Homo sapiens exhibit sophisticated social structures, cultural practices, and technological innovations, which have been critical in shaping our ability to adapt to diverse environments and thrive as a species. These traits collectively underscore the intricate interplay between biology and behavior that defines the human condition.",
            "title": "Introduction",
            "content": "The human species, Homo sapiens, is one of the most unique and fascinating species on Earth. This section will introduce the basic characteristics of humans and set the stage for exploring interesting aspects of the species."}</code></pre><p>As before, let&#8217;s run the following:</p><pre><code>response = client.chat.completions.create(
    model="DeepSeek-R1",
    messages=[{"role":"system","content": SYSTEM_PROMPT_REFLECTION},
              {"role":"user","content":json.dumps(input)}],
    temperature=1
)

print(remove_reasoning_from_output(response.choices[0].message.content))</code></pre><p>The output:</p><pre><code>{
  "search_query": "Recent research on Homo sapiens evolution, interaction with other human species, and factors contributing to their success",
  "reasoning": "The current paragraph provides a good overview of Homo sapiens' characteristics but lacks depth on evolutionary history and interactions with other species. Including recent research on these topics will enhance the paragraph's comprehensiveness and provide up-to-date information."
}</code></pre><p>Now we run the query, append the new results to the paragraph state and combine them with the latest paragraph state.</p><p></p><h4>Part 7: Enriching latest paragraph state with reflection search results.</h4><p>After running the search query from the reflection step:</p><pre><code>search_results = tavily_search("Recent research on Homo sapiens evolution, interaction with other human species, and factors contributing to their success")</code></pre><p>We can update the search history of the paragraph with:</p><pre><code>update_state_with_search_results(search_results, idx_paragraph, state)</code></pre><p>We then combine the new search results with the latest paragraph state, and run Parts 6 and 7 in a loop for a specified number of reflection steps.</p><p></p><h4>Part 8: Summarising and producing the report.</h4><p>We repeat the steps from <em>Parts 4 - 7</em> for each paragraph that was planned in <em>Part 2</em>. Once we have all of the final paragraph states ready, we can stitch the whole thing together. We will do that with an LLM and produce a nicely formatted Markdown document. Here is the prompt:</p><pre><code>input_schema_report_formatting = {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "title": {"type": "string"},
                    "paragraph_latest_state": {"type": "string"}
            }
        }
    }
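
# Sketch, not from the original walkthrough: the per-paragraph control flow of Parts 4-7.
# Assumptions: plan_search, run_search, summarise and reflect are hypothetical callables
# wrapping the prompts and the Tavily tool shown earlier. They are passed in as
# arguments so the loop can be read and tested without network access.
def research_paragraph(paragraph, plan_search, run_search, summarise, reflect, n_reflections=2):
    # Part 4: plan the first query from the planned title and content.
    query = plan_search(paragraph.title, paragraph.content)
    # Part 3: run the search and keep the results in the paragraph state.
    results = run_search(query)
    paragraph.research.search_history.extend(results)
    # Part 5: write the first summary.
    paragraph.research.latest_summary = summarise(paragraph, query, results)
    # Parts 6 and 7: reflect, search again and fold the new results into the summary.
    for _ in range(n_reflections):
        query = reflect(paragraph)
        results = run_search(query)
        paragraph.research.search_history.extend(results)
        paragraph.research.latest_summary = summarise(paragraph, query, results)
        paragraph.research.reflection_iteration += 1
    return paragraph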

SYSTEM_PROMPT_REPORT_FORMATTING = f"""
You are a Deep Research assistant. You have already performed the research and constructed final versions of all paragraphs in the report.
You will get the data in the following json format:

&lt;INPUT JSON SCHEMA&gt;
{json.dumps(input_schema_report_formatting, indent=2)}
&lt;/INPUT JSON SCHEMA&gt;

Your job is to format the Report nicely and return it in Markdown.
If a Conclusion paragraph is not present, add one to the end of the report based on the latest state of the other paragraphs.
"""</code></pre><p>Run:</p><pre><code>report_data = [{"title": paragraph.title, "paragraph_latest_state": paragraph.research.latest_summary} for paragraph in STATE.paragraphs]

response = client.chat.completions.create(
    model="DeepSeek-R1",
    messages=[{"role":"system","content": SYSTEM_PROMPT_REPORT_FORMATTING},
              {"role":"user","content":json.dumps(report_data)}],
    temperature=1
)

print(remove_reasoning_from_output(response.choices[0].message.content))</code></pre><p>And that&#8217;s it, you have yourself a Deep Research report on the topic you provided.</p><h4>Conclusion.</h4><p>Congratulations! You have successfully implemented a Deep Research Agent from scratch. </p><p>If you want to look at a more complete implementation, you can find it in my GitHub repository:</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://github.com/swirl-ai/ai-angineers-handbook/tree/main/building_agents_from_scratch/deep_research_agent&quot;,&quot;text&quot;:&quot;GitHub Repository&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://github.com/swirl-ai/ai-angineers-handbook/tree/main/building_agents_from_scratch/deep_research_agent"><span>GitHub Repository</span></a></p><p>Also, there are many nuances to take into account that could make the system more stable:</p><ul><li><p>It is not easy to make the system produce consistently well-formatted json outputs, as reasoning models are known to be weak at structured outputs.</p></li><li><p>Knowing the above, it might make sense to use different models for different tasks in the system topology; we mostly need a reasoning model only for the first step.</p></li><li><p>There are many improvements that could be implemented in how we search the web and how we rank the retrieved results.</p></li><li><p>The number of Reflection steps could be made dynamic, letting the LLM decide whether more are needed.</p></li><li><p>We could return the links that were used when searching the web and provide references in the report for each paragraph.</p></li><li><p>&#8230;</p></li></ul><p></p><p>If you want to follow SambaNova&#8217;s work, they have recently released some work on Deep Research agents as well. 
You can find more information <a href="https://sambanova.ai/blog/open-source-deep-research-agents">here</a>.</p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/p/sponsorships&quot;,&quot;text&quot;:&quot;Partner with SwirlAI&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.newsletter.swirlai.com/p/sponsorships"><span>Partner with SwirlAI</span></a></p>]]></content:encoded></item><item><title><![CDATA[Simple way to explain Memory in AI 
Agents.]]></title><description><![CDATA[Win a NVIDIA RTX 4080 SUPER GPU!]]></description><link>https://www.newsletter.swirlai.com/p/simple-way-to-explain-memory-in-ai</link><guid isPermaLink="false">https://www.newsletter.swirlai.com/p/simple-way-to-explain-memory-in-ai</guid><dc:creator><![CDATA[Aurimas Griciūnas]]></dc:creator><pubDate>Wed, 26 Feb 2025 10:31:04 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/b4797f9b-8172-4885-ad37-16901f87227e_3614x2582.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#128075; <em>I am <a href="https://www.linkedin.com/in/aurimas-griciunas">Aurimas</a>. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">SwirlAI Newsletter is a reader-supported publication. 
To receive new posts and support my work, consider becoming a subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><p>One of the most fulfilling aspects of being part of a community is being able to partner with various organisations with the goal of giving away exciting items.</p><p>This time I am partnering with NVIDIA to give away a <strong>NVIDIA RTX 4080 SUPER GPU</strong> to one of the SwirlAI community members.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!D1Zl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e843ea8-c5f0-416f-b95c-f485804c4954_1095x1144.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!D1Zl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e843ea8-c5f0-416f-b95c-f485804c4954_1095x1144.png 424w, https://substackcdn.com/image/fetch/$s_!D1Zl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e843ea8-c5f0-416f-b95c-f485804c4954_1095x1144.png 848w, https://substackcdn.com/image/fetch/$s_!D1Zl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e843ea8-c5f0-416f-b95c-f485804c4954_1095x1144.png 1272w, 
https://substackcdn.com/image/fetch/$s_!D1Zl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e843ea8-c5f0-416f-b95c-f485804c4954_1095x1144.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!D1Zl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e843ea8-c5f0-416f-b95c-f485804c4954_1095x1144.png" width="500" height="522.3744292237443" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8e843ea8-c5f0-416f-b95c-f485804c4954_1095x1144.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1144,&quot;width&quot;:1095,&quot;resizeWidth&quot;:500,&quot;bytes&quot;:312886,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/157928657?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e843ea8-c5f0-416f-b95c-f485804c4954_1095x1144.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!D1Zl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e843ea8-c5f0-416f-b95c-f485804c4954_1095x1144.png 424w, https://substackcdn.com/image/fetch/$s_!D1Zl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e843ea8-c5f0-416f-b95c-f485804c4954_1095x1144.png 848w, 
https://substackcdn.com/image/fetch/$s_!D1Zl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e843ea8-c5f0-416f-b95c-f485804c4954_1095x1144.png 1272w, https://substackcdn.com/image/fetch/$s_!D1Zl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8e843ea8-c5f0-416f-b95c-f485804c4954_1095x1144.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a></figure></div><p></p><h4>In order to participate in the giveaway, you need to:</h4><div><hr></div><ul><li><p>Register for GTC 2025 via the 
following link:</p><p></p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://nvda.ws/4h5H52g&quot;,&quot;text&quot;:&quot;Register to GTC 2025&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://nvda.ws/4h5H52g"><span>Register to GTC 2025</span></a></p><p></p></li><li><p>DM me to get further instructions.</p></li></ul><div><hr></div><p></p><p>The conference runs from March 17th to 21st. It will be held in a hybrid format: in San Jose, CA and virtually.</p><p><strong>Virtual attendance is FREE</strong> - so there&#8217;s no reason to miss out!</p><p>The lineup and the list of topics are really extensive. </p><p>Personally I am most excited about <strong>Robotics, IoT and AI Agents on the edge</strong>. These are the four sessions that I am adding to my attendance list:</p><ul><li><p><strong>An Introduction to NVIDIA Isaac GR00T for Humanoid Developers</strong> - excited to learn more about the platform for developing humanoid robots that NVIDIA is building.</p></li><li><p><strong>A New Era of Generalist Robotics: The Rise of Humanoids</strong> - excited to get the latest insights into how foundation models are used in humanoid robotics and what challenges are still to be overcome.</p></li><li><p><strong>Building Edge and Robotics Applications with Generative AI on NVIDIA Jetson</strong> - a chance to sharpen my knowledge of the latest developments in Agents on the edge and get inspiration for my own projects.</p></li><li><p><strong>AI Meets Robotics: European Startup Showcase</strong> - looking forward to hearing more about where we stand with Robotics advancements in Europe.</p></li></ul><p>Be sure to subscribe to the Newsletter if you don&#8217;t want to miss any future giveaways or any other perks of being part of the community!</p><p class="button-wrapper" 
data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p></p><div><hr></div><h3>Refresher: Simple explanation of Memory in Agentic Systems.</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!lCFR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf2f871d-cc5a-46da-b513-a3a743711aa2_1249x1220.gif" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lCFR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf2f871d-cc5a-46da-b513-a3a743711aa2_1249x1220.gif 424w, https://substackcdn.com/image/fetch/$s_!lCFR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf2f871d-cc5a-46da-b513-a3a743711aa2_1249x1220.gif 848w, https://substackcdn.com/image/fetch/$s_!lCFR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf2f871d-cc5a-46da-b513-a3a743711aa2_1249x1220.gif 1272w, https://substackcdn.com/image/fetch/$s_!lCFR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf2f871d-cc5a-46da-b513-a3a743711aa2_1249x1220.gif 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!lCFR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf2f871d-cc5a-46da-b513-a3a743711aa2_1249x1220.gif" width="1249" height="1220" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/af2f871d-cc5a-46da-b513-a3a743711aa2_1249x1220.gif&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1220,&quot;width&quot;:1249,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1104801,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/gif&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/157928657?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf2f871d-cc5a-46da-b513-a3a743711aa2_1249x1220.gif&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lCFR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf2f871d-cc5a-46da-b513-a3a743711aa2_1249x1220.gif 424w, https://substackcdn.com/image/fetch/$s_!lCFR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf2f871d-cc5a-46da-b513-a3a743711aa2_1249x1220.gif 848w, https://substackcdn.com/image/fetch/$s_!lCFR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf2f871d-cc5a-46da-b513-a3a743711aa2_1249x1220.gif 1272w, https://substackcdn.com/image/fetch/$s_!lCFR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faf2f871d-cc5a-46da-b513-a3a743711aa2_1249x1220.gif 
1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Agentic System Memory</figcaption></figure></div><p>In general, the memory of an agent is something that we provide via context in the prompt passed to the LLM; it helps the agent plan and react better given past interactions or data that is not immediately available.<br><br>It is useful to group the memory into four types:</p><ol><li><p><strong>Episodic</strong> - This type of memory contains past interactions and actions performed by the agent. 
After an action is taken, the application controlling the agent stores the action in some kind of persistent storage so that it can be retrieved later if needed. A good example would be using a vector database to store the semantic meaning of the interactions.</p></li><li><p><strong>Semantic</strong> - Any external information that is available to the agent and any knowledge the agent should have about itself. You can think of this as context similar to that used in RAG applications. It can be internal knowledge only available to the agent or a grounding context that isolates part of the internet-scale data for more accurate answers.</p></li><li><p><strong>Procedural</strong> - This is systemic information like the structure of the System Prompt, available tools, guardrails etc. It will usually be stored in Git, Prompt and Tool Registries.</p></li><li><p>Occasionally, the agent application pulls information from long-term memory and stores it locally if it is needed for the task at hand.</p></li><li><p>All of the information pulled together from long-term memory or stored in local memory is called short-term or working memory. Compiling all of it produces the prompt that is passed to the LLM, which then provides further actions to be taken by the system.</p></li></ol><p>We usually label 1. - 3. as Long-Term memory and 5. as Short-Term memory.</p><p>And that is it! The rest is all about how you architect the topology of your Agentic Systems.</p><p>Stay safe and hope to see you in the next Newsletter episode. 
We will continue building Agentic Systems from scratch!</p><div><hr></div><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/p/ai-clouds-and-their-role-in-the-ai?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&amp;token=eyJ1c2VyX2lkIjoxNDEyMjI1OSwicG9zdF9pZCI6MTUyNTA4NjQ2LCJpYXQiOjE3NDAzOTYwMTIsImV4cCI6MTc0Mjk4ODAxMiwiaXNzIjoicHViLTExNDQxNzEiLCJzdWIiOiJwb3N0LXJlYWN0aW9uIn0.hEuIF3u-FXWjvF5Xda_3RqV_HzkhNno-5C5ykS73wUo&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.newsletter.swirlai.com/p/ai-clouds-and-their-role-in-the-ai?utm_source=substack&amp;utm_medium=email&amp;utm_content=share&amp;action=share&amp;token=eyJ1c2VyX2lkIjoxNDEyMjI1OSwicG9zdF9pZCI6MTUyNTA4NjQ2LCJpYXQiOjE3NDAzOTYwMTIsImV4cCI6MTc0Mjk4ODAxMiwiaXNzIjoicHViLTExNDQxNzEiLCJzdWIiOiJwb3N0LXJlYWN0aW9uIn0.hEuIF3u-FXWjvF5Xda_3RqV_HzkhNno-5C5ykS73wUo"><span>Share</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/p/sponsorships&quot;,&quot;text&quot;:&quot;Partner with SwirlAI&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://www.newsletter.swirlai.com/p/sponsorships"><span>Partner with SwirlAI</span></a></p>]]></content:encoded></item><item><title><![CDATA[Data Pipelines in Machine Learning Systems.]]></title><description><![CDATA[A hands-on 
tutorial of real time web data ingestion pipeline followed by Apache Spark based ETL.]]></description><link>https://www.newsletter.swirlai.com/p/data-pipelines-in-machine-learning</link><guid isPermaLink="false">https://www.newsletter.swirlai.com/p/data-pipelines-in-machine-learning</guid><dc:creator><![CDATA[Aurimas Griciūnas]]></dc:creator><pubDate>Mon, 24 Feb 2025 12:01:06 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/27e4bd56-08f7-4478-b666-e1e47ee8b49a_4248x2213.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#128075; <em>I am <a href="https://www.linkedin.com/in/aurimas-griciunas">Aurimas</a>. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">SwirlAI Newsletter is a reader-supported publication. To receive new posts and support my work, consider becoming a subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><p>AI and AI Agents are the topic of 2025. 
However, we often overlook the crucial element that drives these applications - Data.</p><p><strong>Data Pipelines in Machine Learning Systems</strong> can become complex, and for a good reason.</p><p>It is critical to ensure Data Quality and Integrity upstream of ML Training and Inference Pipelines; trying to do that in downstream systems will inevitably cause failures when working at scale.</p><p>Example architecture for a production-grade end-to-end data flow:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kbvd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d66eb91-825f-40d2-8e19-48a1330d913d_3648x3704.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kbvd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d66eb91-825f-40d2-8e19-48a1330d913d_3648x3704.png 424w, https://substackcdn.com/image/fetch/$s_!kbvd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d66eb91-825f-40d2-8e19-48a1330d913d_3648x3704.png 848w, https://substackcdn.com/image/fetch/$s_!kbvd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d66eb91-825f-40d2-8e19-48a1330d913d_3648x3704.png 1272w, https://substackcdn.com/image/fetch/$s_!kbvd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d66eb91-825f-40d2-8e19-48a1330d913d_3648x3704.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!kbvd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d66eb91-825f-40d2-8e19-48a1330d913d_3648x3704.png" width="584" height="592.8241758241758" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d66eb91-825f-40d2-8e19-48a1330d913d_3648x3704.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/14dec39c-f9ef-485b-9c0a-81e9ef59b385_3648x3704.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1478,&quot;width&quot;:1456,&quot;resizeWidth&quot;:584,&quot;bytes&quot;:1395979,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14dec39c-f9ef-485b-9c0a-81e9ef59b385_3648x3704.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kbvd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d66eb91-825f-40d2-8e19-48a1330d913d_3648x3704.png 424w, https://substackcdn.com/image/fetch/$s_!kbvd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d66eb91-825f-40d2-8e19-48a1330d913d_3648x3704.png 848w, https://substackcdn.com/image/fetch/$s_!kbvd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d66eb91-825f-40d2-8e19-48a1330d913d_3648x3704.png 1272w, 
https://substackcdn.com/image/fetch/$s_!kbvd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d66eb91-825f-40d2-8e19-48a1330d913d_3648x3704.png 1456w" sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Data Pipeline in Machine Learning System</figcaption></figure></div><ol><li><p>Schema changes are implemented in version control; once approved, they are pushed to the Applications generating the Data, the Databases holding the Data and a central Data Contract Registry.</p></li><li><p>Events are emitted directly by the Application Services or 
via CDC to Kafka topics.</p></li><li><p>One or more Flink Applications consume Data from the Raw Data streams and validate it against schemas in the Contract Registry.</p></li><li><p>Data that does not meet the contract is pushed to a Dead Letter Topic.</p></li><li><p>Data that meets the contract is pushed to a Validated Data Topic.</p></li><li><p>Data from the Validated Data Topic is pushed to object storage for additional Validation.</p></li><li><p>On a schedule, Data in the Object Storage is validated against additional SLAs in Data Contracts and is pushed to the Data Warehouse to be Transformed and Modeled for Analytical purposes.</p></li><li><p>Modeled and Curated data is pushed to the Feature Store System for further Feature Engineering.</p></li><li><p>High Quality Data is used in Machine Learning Training Pipelines.</p></li><li><p>The same Data is used for Feature Serving in Inference.</p><p></p></li></ol><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p></p><p>In this hands-on project Newsletter episode I am putting on my Data Engineer hat, and we will implement a simplified real-time data ingestion pipeline that resembles what you would see in real-world production environments:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!aBCb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F117cf8ae-f071-42c0-8131-e12b5bc6b21e_2180x3008.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!aBCb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F117cf8ae-f071-42c0-8131-e12b5bc6b21e_2180x3008.png 424w, https://substackcdn.com/image/fetch/$s_!aBCb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F117cf8ae-f071-42c0-8131-e12b5bc6b21e_2180x3008.png 848w, https://substackcdn.com/image/fetch/$s_!aBCb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F117cf8ae-f071-42c0-8131-e12b5bc6b21e_2180x3008.png 1272w, https://substackcdn.com/image/fetch/$s_!aBCb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F117cf8ae-f071-42c0-8131-e12b5bc6b21e_2180x3008.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!aBCb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F117cf8ae-f071-42c0-8131-e12b5bc6b21e_2180x3008.png" width="448" height="618.1538461538462" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/117cf8ae-f071-42c0-8131-e12b5bc6b21e_2180x3008.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f1c066ff-abcc-4adc-8b97-1a6f3d8ebd3a_2180x3008.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:2009,&quot;width&quot;:1456,&quot;resizeWidth&quot;:448,&quot;bytes&quot;:481384,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff1c066ff-abcc-4adc-8b97-1a6f3d8ebd3a_2180x3008.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!aBCb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F117cf8ae-f071-42c0-8131-e12b5bc6b21e_2180x3008.png 424w, https://substackcdn.com/image/fetch/$s_!aBCb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F117cf8ae-f071-42c0-8131-e12b5bc6b21e_2180x3008.png 848w, https://substackcdn.com/image/fetch/$s_!aBCb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F117cf8ae-f071-42c0-8131-e12b5bc6b21e_2180x3008.png 1272w, https://substackcdn.com/image/fetch/$s_!aBCb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F117cf8ae-f071-42c0-8131-e12b5bc6b21e_2180x3008.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ol><li><p>Data producers sending data to the collector service in real time.</p></li><li><p>Real time data ingestion via a collector service.</p></li><li><p>Batching and writing this data in object storage.</p></li><li><p>Raw data staged for further processing.</p></li><li><p>Batch Spark job that cleans, deduplicates and potentially enriches raw data.</p></li><li><p>Scheduled Airflow task that runs the ETL every N minutes.</p><p></p></li></ol><p>I will explain why each part of the infrastructure is a good idea as we go through the project.</p><p>You can find all of the code to support you in <a href="https://github.com/AurimasGr/sai-nebius-spark">this</a> GitHub repository.</p><p>If you do run into any problems while 
following the project, let me know in the comment section or drop me a PM and we will solve it together.</p><p></p><div><hr></div><p>This newsletter episode was made possible by Nebius, an AI Cloud provider that has recently graduated to become one of the 4 AI Cloud giants. Congratulations!</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Psk6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Psk6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 424w, https://substackcdn.com/image/fetch/$s_!Psk6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 848w, https://substackcdn.com/image/fetch/$s_!Psk6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 1272w, https://substackcdn.com/image/fetch/$s_!Psk6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Psk6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png" width="460" height="87.4" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:190,&quot;width&quot;:1000,&quot;resizeWidth&quot;:460,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Online classes - nebius&quot;,&quot;title&quot;:&quot;Online classes - nebius&quot;,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Online classes - nebius" title="Online classes - nebius" srcset="https://substackcdn.com/image/fetch/$s_!Psk6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 424w, https://substackcdn.com/image/fetch/$s_!Psk6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 848w, https://substackcdn.com/image/fetch/$s_!Psk6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 1272w, https://substackcdn.com/image/fetch/$s_!Psk6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Consider registering if you want to follow the tutorial; all of the pieces will be deployed on the Nebius platform. 
We will also use the Managed Service for Apache Spark provided by Nebius.</p><p>Managed Service for Apache Spark is in a preview stage right now, which means that you can run the clusters for free!</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://nebius.com/&quot;,&quot;text&quot;:&quot;Register&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://nebius.com/"><span>Register</span></a></p><div><hr></div><p></p><h3>Let&#8217;s go build.</h3><p></p><p>We will start by implementing the Real Time Ingestion pipeline:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!sdqL!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b7e047-d607-4419-9168-dce3fe1d1810_3123x1256.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!sdqL!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b7e047-d607-4419-9168-dce3fe1d1810_3123x1256.png 424w, https://substackcdn.com/image/fetch/$s_!sdqL!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b7e047-d607-4419-9168-dce3fe1d1810_3123x1256.png 848w, https://substackcdn.com/image/fetch/$s_!sdqL!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b7e047-d607-4419-9168-dce3fe1d1810_3123x1256.png 1272w, https://substackcdn.com/image/fetch/$s_!sdqL!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b7e047-d607-4419-9168-dce3fe1d1810_3123x1256.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!sdqL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b7e047-d607-4419-9168-dce3fe1d1810_3123x1256.png" width="1456" height="586" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/38b7e047-d607-4419-9168-dce3fe1d1810_3123x1256.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b227f9d1-c3f9-43da-8efd-6a20417e5dad_3123x1256.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:586,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:219596,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb227f9d1-c3f9-43da-8efd-6a20417e5dad_3123x1256.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!sdqL!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b7e047-d607-4419-9168-dce3fe1d1810_3123x1256.png 424w, https://substackcdn.com/image/fetch/$s_!sdqL!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b7e047-d607-4419-9168-dce3fe1d1810_3123x1256.png 848w, https://substackcdn.com/image/fetch/$s_!sdqL!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b7e047-d607-4419-9168-dce3fe1d1810_3123x1256.png 1272w, 
https://substackcdn.com/image/fetch/$s_!sdqL!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F38b7e047-d607-4419-9168-dce3fe1d1810_3123x1256.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><ul><li><p><strong>The Collector.</strong></p><ul><li><p>Write a Python application that uses the FastAPI framework to expose a REST API endpoint.</p></li><li><p>Deploy the application on Kubernetes.</p></li><li><p>Horizontally scale the application for High Availability.</p></li></ul></li><li><p><strong>Data Producer.</strong></p><ul><li><p>Write a Python application 
to download data from the internet.</p></li><li><p>Run the application on Kubernetes.</p></li><li><p>Send the downloaded data to a previously deployed REST API endpoint.</p></li></ul></li></ul><p></p><h4><strong>Prerequisites.</strong></h4><p>This tutorial assumes that you know the basics of Python and have skimmed through my latest Guide to Kubernetes which you can find <a href="https://www.newsletter.swirlai.com/p/a-guide-to-kubernetes-part-1">here</a>.</p><p></p><h3><strong>Defining the Collector architecture.</strong></h3><p></p><p>The following is the first version of The Collector that we will be implementing:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TLqv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91827470-f221-4374-8ad8-86aaddbd8e21_3190x3310.png" 
data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TLqv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91827470-f221-4374-8ad8-86aaddbd8e21_3190x3310.png 424w, https://substackcdn.com/image/fetch/$s_!TLqv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91827470-f221-4374-8ad8-86aaddbd8e21_3190x3310.png 848w, https://substackcdn.com/image/fetch/$s_!TLqv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91827470-f221-4374-8ad8-86aaddbd8e21_3190x3310.png 1272w, https://substackcdn.com/image/fetch/$s_!TLqv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91827470-f221-4374-8ad8-86aaddbd8e21_3190x3310.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TLqv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91827470-f221-4374-8ad8-86aaddbd8e21_3190x3310.png" width="1456" height="1511" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/91827470-f221-4374-8ad8-86aaddbd8e21_3190x3310.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d258b2c4-89b8-4e65-a995-f38294d0ab1d_3190x3310.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1511,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:792451,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd258b2c4-89b8-4e65-a995-f38294d0ab1d_3190x3310.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TLqv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91827470-f221-4374-8ad8-86aaddbd8e21_3190x3310.png 424w, https://substackcdn.com/image/fetch/$s_!TLqv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91827470-f221-4374-8ad8-86aaddbd8e21_3190x3310.png 848w, https://substackcdn.com/image/fetch/$s_!TLqv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91827470-f221-4374-8ad8-86aaddbd8e21_3190x3310.png 1272w, https://substackcdn.com/image/fetch/$s_!TLqv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F91827470-f221-4374-8ad8-86aaddbd8e21_3190x3310.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p>Collector will be a Highly Available REST API application implemented with FastAPI Framework.</p></li><li><p>Exposed REST API endpoints will be able to accept events that meet specific schema requirements - we will check for top level field existence. These top level fields will allow us to validate Data Contracts later on in the Downstream Processes.</p></li><li><p>Fields that we will be expecting during validation:</p><ul><li><p><strong>event_type</strong> - this will be needed to identify the source that generated the event.</p></li><li><p><strong>schema_version</strong> - this field allows us to identify which version of the schema of the given data source is represented in the event. 
This data will be used for schema evolution.</p></li><li><p><strong>payload</strong> - actual data of interest that is relevant for analytics purposes.</p></li></ul></li><li><p>The Collector application will add further fields on top of the existing ones:</p><ul><li><p><strong>collector_timestamp</strong> - timestamp of when the collector processed the event for downstream tasks.</p></li><li><p><strong>root_id</strong> - a unique identifier for the event in the system (uuid).</p></li><li><p><strong>collector_id</strong> - identifier of the collector process, used for debugging when we need to know exactly where the event was processed before it was emitted to the downstream systems.</p></li></ul></li><li><p>In this tutorial we will not use a distributed messaging system like Kafka; instead, we will write batched events directly to Object Storage from the Collector.</p></li><li><p>We will front the Collector applications with a Load Balancer for High Availability purposes.</p></li><li><p>We will emulate the Data Producers by downloading batch datasets from the internet and emitting records of these datasets to the Collector applications one by one every n milliseconds/seconds.</p><ul><li><p>We will use the popular TLC Trip Record Dataset available for download <a href="https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page">here</a>.</p></li><li><p>To emulate distributed systems and multiple data sources, we will download the November 2024 data for both the Yellow and Green taxi datasets and emit them in parallel using separate applications.</p></li></ul></li><li><p>Collectors are the most sensitive part of your pipeline - you want to do as little processing and data manipulation here as possible. Usually, any errors here would result in permanent Data Loss (naturally, less processing on the Data Producer side is also desirable, as it suffers from the same constraints). 
Once the data is in the Downstream System, you can always retry additional computation.</p></li><li><p>Hence, we decouple Collectors from any additional Schema Validation or Enrichment of the Payload and move that downstream.</p></li></ul><h4><strong>How does it all look from Kubernetes' perspective?</strong></h4><p>As mentioned earlier, our goal in this hands-on tutorial is to deploy everything on Nebius AI Cloud. The first set of applications will be deployed on Kubernetes.</p><p>Collectors and Producers map very naturally to Kubernetes resources:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ve11!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb55d31fb-a9b0-4d74-b304-3db4e9773cc2_3843x3412.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ve11!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb55d31fb-a9b0-4d74-b304-3db4e9773cc2_3843x3412.png 424w, https://substackcdn.com/image/fetch/$s_!Ve11!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb55d31fb-a9b0-4d74-b304-3db4e9773cc2_3843x3412.png 848w, https://substackcdn.com/image/fetch/$s_!Ve11!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb55d31fb-a9b0-4d74-b304-3db4e9773cc2_3843x3412.png 1272w, https://substackcdn.com/image/fetch/$s_!Ve11!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb55d31fb-a9b0-4d74-b304-3db4e9773cc2_3843x3412.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Ve11!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb55d31fb-a9b0-4d74-b304-3db4e9773cc2_3843x3412.png" width="1456" height="1293" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b55d31fb-a9b0-4d74-b304-3db4e9773cc2_3843x3412.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5336bade-4a9b-4ae6-9148-46fd8eb3144f_3843x3412.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1293,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:878672,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5336bade-4a9b-4ae6-9148-46fd8eb3144f_3843x3412.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ve11!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb55d31fb-a9b0-4d74-b304-3db4e9773cc2_3843x3412.png 424w, https://substackcdn.com/image/fetch/$s_!Ve11!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb55d31fb-a9b0-4d74-b304-3db4e9773cc2_3843x3412.png 848w, https://substackcdn.com/image/fetch/$s_!Ve11!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb55d31fb-a9b0-4d74-b304-3db4e9773cc2_3843x3412.png 1272w, 
https://substackcdn.com/image/fetch/$s_!Ve11!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb55d31fb-a9b0-4d74-b304-3db4e9773cc2_3843x3412.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><ul><li><p>We will be deploying all applications in a single namespace called <em>swirlai</em>.</p></li><li><p>Collector applications will take the form of a deployment called <em>collector</em>.</p></li><li><p>We can scale the collector to as many replicas as we need.</p></li><li><p>We will mount a service named <em>collector</em> on top of the pods created 
by the collector deployment and map it to port 80 of the containers running in those pods. The service itself will expose port 80.</p></li><li><p>Producer applications will be deployed as separate pods, one per dataset. We will go for 2 pods named <em>producer-1</em> and <em>producer-2</em> that will download the November Green Taxi and Yellow Taxi data respectively.</p></li><li><p>Since all applications will be deployed in the same namespace, we will be able to access the service directly by its name <em>collector</em>.</p></li><li><p>We will expose the collection endpoint under the path <em>/api/v1/collect</em> and accept POST requests to it.</p></li><li><p>Ultimately, we will be sending events to the collector applications via POST requests to the <a href="http://collector/api/v1/collect">http://collector/api/v1/collect</a> endpoint.</p></li></ul><h3><strong>Implementing the Collector Application.</strong></h3><p>Reminder - you can find the code used 
in this Newsletter episode in one of my GitHub repositories <a href="https://github.com/AurimasGr/sai-nebius-spark">here</a>.</p><p>Before we can deploy anything, we will need a Kubernetes cluster, Object Storage and a service account configured with Object Storage access.</p><p></p><h4>Creating a Kubernetes Cluster.</h4><p>If you haven&#8217;t yet, create a Nebius account <a href="https://nebius.com/">here</a>.</p><p>You can easily create Kubernetes clusters via the Nebius UI.</p><p>Before you can start the deployment, you will need to install the Nebius CLI tool. If you already have your account set up, it should be straightforward.</p><p>Assuming that you are running on macOS, run:</p><pre><code><code>brew install jq
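# The URL below fetches the Apple Silicon (arm64) build of kubectl; on an Intel Mac, replace arm64 with amd64.
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/darwin/arm64/kubectl"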
chmod +x ./kubectl
sudo mv ./kubectl /usr/local/bin/kubectl
sudo chown root: /usr/local/bin/kubectl
curl -sSL https://storage.ai.nebius.cloud/nebius/install.sh | bash</code></code></pre><p>This will set you up with the kubectl tool for communicating with the Kubernetes cluster and the nebius CLI tool for authenticating with Nebius. After this, run:</p><pre><code><code>nebius profile create</code></code></pre><p>It will prompt you for:</p><ul><li><p>Name - enter any name.</p></li><li><p>Api endpoint - leave the default <em>api.eu.nebius.cloud</em>.</p></li><li><p>Authorisation type - choose <em>federation</em>.</p></li></ul><p>After the above, you will be redirected to a browser window where you will be authenticated. You now have your Nebius CLI tool set up.</p><p>If anything fails at this point, refer to the official Nebius documentation here: <a href="https://docs.nebius.com/kubernetes/quickstart/#env-install">Link</a>.</p><p>For simplicity, we will perform the deployment via the Nebius Cloud UI. It is easy to figure out, but let&#8217;s go step by step so that there are no unanswered questions.</p><ul><li><p>Once you log in to the console, click on the <em>&#8220;Managed Kubernetes&#8221;</em> tab on the left and click <em>&#8220;+ Create cluster&#8221;</em> in the top right.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_sF4!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4067fd61-37c7-4acf-8789-55c8502258a7_2924x690.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_sF4!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4067fd61-37c7-4acf-8789-55c8502258a7_2924x690.png 424w, 
https://substackcdn.com/image/fetch/$s_!_sF4!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4067fd61-37c7-4acf-8789-55c8502258a7_2924x690.png 848w, https://substackcdn.com/image/fetch/$s_!_sF4!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4067fd61-37c7-4acf-8789-55c8502258a7_2924x690.png 1272w, https://substackcdn.com/image/fetch/$s_!_sF4!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4067fd61-37c7-4acf-8789-55c8502258a7_2924x690.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_sF4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4067fd61-37c7-4acf-8789-55c8502258a7_2924x690.png" width="1456" height="344" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4067fd61-37c7-4acf-8789-55c8502258a7_2924x690.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:344,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:299672,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4067fd61-37c7-4acf-8789-55c8502258a7_2924x690.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!_sF4!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4067fd61-37c7-4acf-8789-55c8502258a7_2924x690.png 424w, https://substackcdn.com/image/fetch/$s_!_sF4!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4067fd61-37c7-4acf-8789-55c8502258a7_2924x690.png 848w, https://substackcdn.com/image/fetch/$s_!_sF4!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4067fd61-37c7-4acf-8789-55c8502258a7_2924x690.png 1272w, https://substackcdn.com/image/fetch/$s_!_sF4!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4067fd61-37c7-4acf-8789-55c8502258a7_2924x690.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><ul><li><p>Once in the cluster creation section:</p><ul><li><p>Provide the cluster name.</p></li><li><p>We switch <em>Control plane high availability</em> off as there is no need for it in a demo project. Be sure to always have it turned on for production use cases.</p></li><li><p>Let&#8217;s keep the <em>Public endpoint</em> on as it will make configuration of kubectl easier for this example.</p></li><li><p>Press <em>&#8220;Create cluster&#8221;</em> once configuration is complete.</p></li></ul></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TPyH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4770093-666c-4eb7-9a15-19bfe53e0425_2920x1492.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!TPyH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4770093-666c-4eb7-9a15-19bfe53e0425_2920x1492.png 424w, https://substackcdn.com/image/fetch/$s_!TPyH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4770093-666c-4eb7-9a15-19bfe53e0425_2920x1492.png 848w, https://substackcdn.com/image/fetch/$s_!TPyH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4770093-666c-4eb7-9a15-19bfe53e0425_2920x1492.png 1272w, https://substackcdn.com/image/fetch/$s_!TPyH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4770093-666c-4eb7-9a15-19bfe53e0425_2920x1492.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!TPyH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4770093-666c-4eb7-9a15-19bfe53e0425_2920x1492.png" width="1456" height="744" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b4770093-666c-4eb7-9a15-19bfe53e0425_2920x1492.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:744,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:886059,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4770093-666c-4eb7-9a15-19bfe53e0425_2920x1492.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" 
class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TPyH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4770093-666c-4eb7-9a15-19bfe53e0425_2920x1492.png 424w, https://substackcdn.com/image/fetch/$s_!TPyH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4770093-666c-4eb7-9a15-19bfe53e0425_2920x1492.png 848w, https://substackcdn.com/image/fetch/$s_!TPyH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4770093-666c-4eb7-9a15-19bfe53e0425_2920x1492.png 1272w, https://substackcdn.com/image/fetch/$s_!TPyH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb4770093-666c-4eb7-9a15-19bfe53e0425_2920x1492.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" 
stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p>You will see a new cluster being provisioned in the <em>&#8220;Managed Kubernetes&#8221;</em> overview tab. Click on it.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!TYCZ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bf026e2-df64-4f8e-b806-8875bb7dbae9_2920x710.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!TYCZ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bf026e2-df64-4f8e-b806-8875bb7dbae9_2920x710.png 424w, https://substackcdn.com/image/fetch/$s_!TYCZ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bf026e2-df64-4f8e-b806-8875bb7dbae9_2920x710.png 848w, https://substackcdn.com/image/fetch/$s_!TYCZ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bf026e2-df64-4f8e-b806-8875bb7dbae9_2920x710.png 1272w, https://substackcdn.com/image/fetch/$s_!TYCZ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bf026e2-df64-4f8e-b806-8875bb7dbae9_2920x710.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!TYCZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bf026e2-df64-4f8e-b806-8875bb7dbae9_2920x710.png" width="1456" height="354" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2bf026e2-df64-4f8e-b806-8875bb7dbae9_2920x710.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:354,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:345924,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bf026e2-df64-4f8e-b806-8875bb7dbae9_2920x710.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!TYCZ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bf026e2-df64-4f8e-b806-8875bb7dbae9_2920x710.png 424w, https://substackcdn.com/image/fetch/$s_!TYCZ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bf026e2-df64-4f8e-b806-8875bb7dbae9_2920x710.png 848w, https://substackcdn.com/image/fetch/$s_!TYCZ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bf026e2-df64-4f8e-b806-8875bb7dbae9_2920x710.png 1272w, https://substackcdn.com/image/fetch/$s_!TYCZ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F2bf026e2-df64-4f8e-b806-8875bb7dbae9_2920x710.png 1456w" 
sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><ul><li><p>The above steps have provisioned a control plane for K8s. Now we need to add some worker nodes. Click on the <em>&#8220;Node groups&#8221;</em> tab and click <em>&#8220;+ Create new group&#8221;</em>.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!28zp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb07c9643-bc40-4c0c-bf7a-0ac9579f5ed0_2920x1258.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!28zp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb07c9643-bc40-4c0c-bf7a-0ac9579f5ed0_2920x1258.png 424w, https://substackcdn.com/image/fetch/$s_!28zp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb07c9643-bc40-4c0c-bf7a-0ac9579f5ed0_2920x1258.png 848w, https://substackcdn.com/image/fetch/$s_!28zp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb07c9643-bc40-4c0c-bf7a-0ac9579f5ed0_2920x1258.png 1272w, https://substackcdn.com/image/fetch/$s_!28zp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb07c9643-bc40-4c0c-bf7a-0ac9579f5ed0_2920x1258.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!28zp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb07c9643-bc40-4c0c-bf7a-0ac9579f5ed0_2920x1258.png" width="1456" height="627" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b07c9643-bc40-4c0c-bf7a-0ac9579f5ed0_2920x1258.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:627,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:581877,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb07c9643-bc40-4c0c-bf7a-0ac9579f5ed0_2920x1258.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!28zp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb07c9643-bc40-4c0c-bf7a-0ac9579f5ed0_2920x1258.png 424w, https://substackcdn.com/image/fetch/$s_!28zp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb07c9643-bc40-4c0c-bf7a-0ac9579f5ed0_2920x1258.png 848w, https://substackcdn.com/image/fetch/$s_!28zp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb07c9643-bc40-4c0c-bf7a-0ac9579f5ed0_2920x1258.png 1272w, https://substackcdn.com/image/fetch/$s_!28zp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb07c9643-bc40-4c0c-bf7a-0ac9579f5ed0_2920x1258.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p>Configure the Node Group for our project:</p><ul><li><p>Give it a <em>Name</em>.</p></li><li><p>Disable the <em>Public IPv4 address</em>: we will be using LoadBalancer services to expose our apps, so the nodes themselves do not need public addresses.</p></li><li><p>Keep <em>Number of nodes</em> at 3, as we will run cheap CPU-based nodes shared by multiple applications.</p></li><li><p>In the <em>Preset</em> field, be sure to select 2 CPUs - 8 GiB RAM; we will not need more.</p></li></ul></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Xpy6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png" 
data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Xpy6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png 424w, https://substackcdn.com/image/fetch/$s_!Xpy6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png 848w, https://substackcdn.com/image/fetch/$s_!Xpy6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png 1272w, https://substackcdn.com/image/fetch/$s_!Xpy6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Xpy6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png" width="969" height="538" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:538,&quot;width&quot;:969,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:176004,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" 
srcset="https://substackcdn.com/image/fetch/$s_!Xpy6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png 424w, https://substackcdn.com/image/fetch/$s_!Xpy6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png 848w, https://substackcdn.com/image/fetch/$s_!Xpy6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png 1272w, https://substackcdn.com/image/fetch/$s_!Xpy6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p>That&#8217;s it, we are ready to connect to our new Kubernetes cluster. Click on the <em>&#8220;How to connect&#8221; </em>button, copy the third command and run it. We are good to go to use kubectl and communicate with the cluster. Let&#8217;s try.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kOAO!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3929717-8dae-4f90-a90a-d62881a57848_2920x1600.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kOAO!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3929717-8dae-4f90-a90a-d62881a57848_2920x1600.png 424w, https://substackcdn.com/image/fetch/$s_!kOAO!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3929717-8dae-4f90-a90a-d62881a57848_2920x1600.png 848w, https://substackcdn.com/image/fetch/$s_!kOAO!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3929717-8dae-4f90-a90a-d62881a57848_2920x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!kOAO!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3929717-8dae-4f90-a90a-d62881a57848_2920x1600.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!kOAO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3929717-8dae-4f90-a90a-d62881a57848_2920x1600.png" width="1456" height="798" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f3929717-8dae-4f90-a90a-d62881a57848_2920x1600.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:798,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:705140,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3929717-8dae-4f90-a90a-d62881a57848_2920x1600.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kOAO!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3929717-8dae-4f90-a90a-d62881a57848_2920x1600.png 424w, https://substackcdn.com/image/fetch/$s_!kOAO!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3929717-8dae-4f90-a90a-d62881a57848_2920x1600.png 848w, https://substackcdn.com/image/fetch/$s_!kOAO!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3929717-8dae-4f90-a90a-d62881a57848_2920x1600.png 1272w, https://substackcdn.com/image/fetch/$s_!kOAO!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3929717-8dae-4f90-a90a-d62881a57848_2920x1600.png 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Run the following in the terminal:</p><pre><code><code>kubectl get pods</code></code></pre><p>You should see an empty list (kubectl reports that no resources were found), as nothing has been deployed to the cluster yet.</p><p></p><h4>Creating Buckets.</h4><p>For this tutorial, you will need 4 buckets to store the data.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yzVp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3051becf-e627-4c45-af15-6ad70627133b_3015x863.png" data-component-name="Image2ToDOM"><div 
class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yzVp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3051becf-e627-4c45-af15-6ad70627133b_3015x863.png 424w, https://substackcdn.com/image/fetch/$s_!yzVp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3051becf-e627-4c45-af15-6ad70627133b_3015x863.png 848w, https://substackcdn.com/image/fetch/$s_!yzVp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3051becf-e627-4c45-af15-6ad70627133b_3015x863.png 1272w, https://substackcdn.com/image/fetch/$s_!yzVp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3051becf-e627-4c45-af15-6ad70627133b_3015x863.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yzVp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3051becf-e627-4c45-af15-6ad70627133b_3015x863.png" width="1456" height="417" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3051becf-e627-4c45-af15-6ad70627133b_3015x863.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:417,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:443070,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3051becf-e627-4c45-af15-6ad70627133b_3015x863.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yzVp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3051becf-e627-4c45-af15-6ad70627133b_3015x863.png 424w, https://substackcdn.com/image/fetch/$s_!yzVp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3051becf-e627-4c45-af15-6ad70627133b_3015x863.png 848w, https://substackcdn.com/image/fetch/$s_!yzVp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3051becf-e627-4c45-af15-6ad70627133b_3015x863.png 1272w, https://substackcdn.com/image/fetch/$s_!yzVp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3051becf-e627-4c45-af15-6ad70627133b_3015x863.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Creating the buckets is trivial: you only need to name them, and the defaults for the other settings will do just fine. Feel free to replace the <em>sai-</em> prefix with your own; just keep in mind that you will then need to change the bucket names in two places down the line:</p><ul><li><p>Collector application via deployment configuration.</p><pre><code> - name: RAW_LANDING_BUCKET
   value: sai-raw</code></pre></li><li><p>Airflow DAG script.</p><pre><code>RAW_LANDING_BUCKET = 'sai-raw'
RAW_PROCESSING_BUCKET = 'sai-processing'
PROCESSED_BUCKET = 'sai-processed'
ARCHIVE_BUCKET = 'sai-archive'</code></pre></li></ul><p></p><h4>Creating Service Account.</h4><p>I will not be able to explain it better than it is done <a href="https://docs.nebius.com/object-storage/quickstart">here</a> :) Just follow the instructions step by step.</p><p><strong>[IMPORTANT]:</strong> We will need the following values from this procedure later, so be sure to save them:</p><ul><li><p>NB_ACCESS_KEY_AWS_ID.</p></li><li><p>NB_SECRET_ACCESS_KEY.</p></li><li><p>AWS_DEFAULT_REGION: eu-north1.</p></li><li><p>AWS_ENDPOINT_URL: https://storage.eu-north1.nebius.cloud:443</p></li></ul><p>I am assuming that you will be running the tutorial in the <em>eu-north1</em> region.</p><p></p><h4>The Code.</h4><p>Below is a screenshot of the structure of the Python code for the Collector application. 
Let&#8217;s dig deeper:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!z5u8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14182774-2c82-42d4-aa75-0bb4100c52da_4849x5654.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!z5u8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14182774-2c82-42d4-aa75-0bb4100c52da_4849x5654.png 424w, https://substackcdn.com/image/fetch/$s_!z5u8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14182774-2c82-42d4-aa75-0bb4100c52da_4849x5654.png 848w, https://substackcdn.com/image/fetch/$s_!z5u8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14182774-2c82-42d4-aa75-0bb4100c52da_4849x5654.png 1272w, https://substackcdn.com/image/fetch/$s_!z5u8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14182774-2c82-42d4-aa75-0bb4100c52da_4849x5654.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!z5u8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14182774-2c82-42d4-aa75-0bb4100c52da_4849x5654.png" width="594" height="692.728021978022" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/14182774-2c82-42d4-aa75-0bb4100c52da_4849x5654.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc31b735-187b-420e-a68b-f77726c9d83c_4849x5654.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1698,&quot;width&quot;:1456,&quot;resizeWidth&quot;:594,&quot;bytes&quot;:1713696,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc31b735-187b-420e-a68b-f77726c9d83c_4849x5654.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!z5u8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14182774-2c82-42d4-aa75-0bb4100c52da_4849x5654.png 424w, https://substackcdn.com/image/fetch/$s_!z5u8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14182774-2c82-42d4-aa75-0bb4100c52da_4849x5654.png 848w, https://substackcdn.com/image/fetch/$s_!z5u8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14182774-2c82-42d4-aa75-0bb4100c52da_4849x5654.png 1272w, https://substackcdn.com/image/fetch/$s_!z5u8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F14182774-2c82-42d4-aa75-0bb4100c52da_4849x5654.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button 
tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ol><li><p><strong>FastAPI Application entrypoint.</strong></p></li></ol><pre><code><code>from fastapi import FastAPI
from schema.schema import TopLevelSchema
from helpers.processor import Processor, Buffer

import config.logger as logger
import config.config as config
import os


logger = logger.setup_logger()
# APP_ENV selects the config class by name: "Dev" resolves to config.DevConfig.
# A missing APP_ENV raises KeyError at import time; an unknown value raises AttributeError.
conf = getattr(config, f'{os.environ["APP_ENV"].title()}Config')

app = FastAPI(docs_url=f"/api/{conf.V_API}/docs",
              redoc_url=f"/api/{conf.V_API}/redoc",
              openapi_url=f"/api/{conf.V_API}/openapi.json")

buffer = Buffer()

@app.get(f"/api/{conf.V_API}")
async def root():

    return {"message": f"Welcome to the collector {os.environ['HOSTNAME']}!"}


@app.post(f"/api/{conf.V_API}{conf.COLLECTOR_PATH}")
async def write_data(data: TopLevelSchema):

    try:
        processed_data = Processor(data.dict()).processed_event
        buffer.add(processed_data)
        logger.info(f"Adding event to buffer: {data.dict()}")
    except Exception:
        # Catch Exception rather than using a bare except, and log the traceback.
        logger.exception(f"Failed to process event: {data}")

    return {"message": "Event added to buffer"}</code></code></pre><p>Take note that we are versioning the API endpoint as per</p><pre><code><code>/api/{conf.V_API}</code></code></pre><p>This is an important practice in REST API development, as you might want to evolve the API contract while keeping the old version available for compatibility reasons. We store the version in the configuration object.</p><p>As per</p><pre><code><code>from helpers.processor import Processor</code></code></pre><p>we are implementing all of the processing logic in a separate script located in the helpers folder.</p><ol start="2"><li><p><strong>Any additional helper functions or Classes.</strong></p><p></p><p>2.1. The Processor class implements the processing logic that is applied to incoming events. The Buffer class holds the logic for buffering events and flushing them to object storage once a threshold is reached. We keep it simple for this example: when the buffer reaches 50 events, they are automatically flushed to object storage.</p></li></ol><pre><code><code>from datetime import datetime
import pytz
import uuid
import os
import json
import boto3


class Processor:

    def __init__(self, event: dict):

        self.event = event

    @staticmethod
    def process_event(event: dict) -&gt; dict:

        def _add_timestamp(record: dict) -&gt; dict:

            record['collector_tstamp'] = datetime.now(tz=pytz.UTC).strftime("%Y-%m-%d %H:%M:%S %z")

            return record

        def _add_id(record: dict) -&gt; dict:

            record['root_id'] = str(uuid.uuid4())

            return record

        def _add_collector_id(record: dict) -&gt; dict:

            record['collector_id'] = os.environ['HOSTNAME']

            return record

        processed_event = _add_timestamp(event)
        processed_event = _add_id(processed_event)
        processed_event = _add_collector_id(processed_event)

        return processed_event

    @property
    def processed_event(self) -&gt; dict:

        return self.process_event(self.event)


class Buffer:

    BUFFER_SIZE = 50
    BUCKET_NAME = os.environ["RAW_LANDING_BUCKET"]

    def __init__(self):

        self.buffer = []
        # No explicit credentials here: boto3 picks up AWS_ACCESS_KEY_ID,
        # AWS_SECRET_ACCESS_KEY and AWS_DEFAULT_REGION from the environment;
        # recent boto3 releases also honour AWS_ENDPOINT_URL.
        self.s3 = boto3.client('s3')

    def add(self, event: dict):
        # Events live in process memory until BUFFER_SIZE is reached, so anything
        # still buffered when the process stops is never written to object storage.

        self.buffer.append(event)

        if len(self.buffer) &gt;= self.BUFFER_SIZE:
            self.flush()

    def clear(self):

        self.buffer = []

    def flush(self):

        self.s3.put_object(
            Body=json.dumps(self.buffer),
            Bucket=self.BUCKET_NAME,
            Key=f'{datetime.now(tz=pytz.UTC).strftime("%Y-%m-%d-%H-%M-%S-%f")}.json'
        )

        self.clear()</code></code></pre><p>As discussed while defining the architecture of the applications, the only processing is the addition of 3 fields: <em>collector_tstamp, root_id</em> and <em>collector_id</em>.</p><ol start="3"><li><p><strong>Schema definition.</strong></p><p></p><p>3.1. We want to decouple the schema definition from the main entrypoint of the FastAPI application and keep it in a separate script for a cleaner structure.</p></li></ol><pre><code><code>from pydantic import BaseModel


class TopLevelSchema(BaseModel):
    event_type: str
    schema_version: str
    payload: dict</code></code></pre><p>This is a really minimal pydantic data model. It is also where one of the main FastAPI advantages comes in: every field defined in TopLevelSchema is required, and both its presence and its type are validated for every event posted to the endpoint. A request that fails validation is rejected with a 422 response before our handler runs.</p><p>Referring back to app.py:</p><pre><code><code>@app.post(f"/api/{conf.V_API}{conf.COLLECTOR_PATH}")
async def write_data(data: TopLevelSchema):</code></code></pre><p>we expect any request body posted to the <em>/api/{conf.V_API}{conf.COLLECTOR_PATH}</em> path to conform to <em>TopLevelSchema</em> (the path resolves to <em>/api/v1/collect</em> in our example).</p><ol start="4"><li><p><strong>Additional application level configuration:</strong></p><p></p><p>4.1. General application configuration.</p></li></ol><pre><code><code>class BaseConfig:

    COLLECTOR_PATH = "/collect"
    V_API = "v1"
    TESTING = False
    DEBUG = False


class DevConfig(BaseConfig):

    DEBUG = True


class ProdConfig(BaseConfig):

    ...


class TestConfig(BaseConfig):

    ...</code></code></pre><p>This is also where we define the collection endpoint path and the current API version.</p><p>4.2. <strong>Logging configuration.</strong></p><pre><code><code>import logging
import os

LOG_FORMAT = f'[%(asctime)s]: {os.getpid()} %(levelname)s %(message)s'
DATE_FORMAT = '%Y-%m-%d %H:%M:%S'
HANDLERS = [logging.StreamHandler()]
LOG_LEVEL = logging.DEBUG


def setup_logger() -&gt; logging.Logger:

    logging.basicConfig(level=LOG_LEVEL,
                        format=LOG_FORMAT,
                        datefmt=DATE_FORMAT,
                        handlers=HANDLERS)

    return logging.getLogger()</code></code></pre><p>This holds some global logging configuration that you can also parametrise according to your environment.</p><ol start="5"><li><p><strong>Python library requirements.</strong></p></li></ol><pre><code><code>fastapi~=0.95.1
pydantic~=1.10.7
uvicorn
pytz
boto3</code></code></pre><p>As you can see, for this example the requirements are extremely minimal as we are not doing any fancy processing inside of the collector and rather treating it as a proxy.</p><ol start="6"><li><p><strong>Dockerfile for the application container.</strong></p></li></ol><pre><code><code>FROM python:3.9

WORKDIR /code/src

COPY ./requirements.txt /code/requirements.txt

RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt

COPY ./src /code/src

CMD ["uvicorn", "app:app", "--host", "0.0.0.0", "--port", "80"]</code></code></pre><p>The Dockerfile is also very minimal for now.</p><p><strong>Important:</strong></p><ul><li><p>This is not a production-ready Dockerfile, as there are many security considerations you should take into account (we will look into how to build a secure Docker container in one of the future Newsletter episodes). For simplicity&#8217;s sake, we will not implement them today.</p></li></ul><p>To build the container image with your collector application, go to the folder where the Dockerfile resides and run the command below. We will be pushing the image to your Docker Hub registry, so you will need to replace the placeholder values in the command:</p><pre><code><code>docker build . --platform linux/amd64 -t &lt;docker_hub_account&gt;/&lt;repository&gt;:&lt;tag&gt;</code></code></pre><p>Once the image is built, push it to your Docker registry so that it can be pulled by the K8s cluster:</p><pre><code><code>docker push &lt;docker_hub_account&gt;/&lt;repository&gt;:&lt;tag&gt;</code></code></pre><p>If you just want to follow the tutorial, feel free to use the image I&#8217;ve built; the latest one can be found under:</p><pre><code>aurimasg/collector:0.2.0</code></pre><ol start="7"><li><p><strong>Kubernetes resource manifests that will be needed to deploy the Collector application.</strong></p><p></p><p>7.1. Namespace.</p></li></ol><pre><code><code>apiVersion: v1
kind: Namespace
metadata:
  name: swirlai</code></code></pre><p>You can create the namespace by running:</p><pre><code><code>kubectl apply -f namespace.yaml</code></code></pre><p>7.2. Secrets.</p><p>Remember the two values we created earlier:</p><ul><li><p>NB_ACCESS_KEY_AWS_ID.</p></li><li><p>NB_SECRET_ACCESS_KEY.</p></li></ul><p>These are equivalent to AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY respectively. You will need to base64-encode them and add the encoded values to the secrets file.</p><pre><code>apiVersion: v1
kind: Secret
metadata:
  name: aws-secret
  namespace: swirlai
type: Opaque
data:
  # Values under `data` must be base64-encoded, for example: printf '%s' '&lt;value&gt;' | base64
  AWS_ACCESS_KEY_ID: &lt;base64 encoded value of AWS_ACCESS_KEY_ID&gt;
  AWS_SECRET_ACCESS_KEY: &lt;base64 encoded value of AWS_SECRET_ACCESS_KEY&gt;</code></pre><p>7.3. Deployment.</p><pre><code><code>apiVersion: apps/v1
kind: Deployment
metadata:
  name: collector
  labels:
    app: collector
  namespace: swirlai
spec:
  replicas: 2
  selector:
    matchLabels:
      app: collector
  template:
    metadata:
      labels:
        app: collector
    spec:
      containers:
      - name: collector
        image: aurimasg/collector:0.2.0
        imagePullPolicy: Always
        ports:
        - containerPort: 80
        env:
          - name: APP_ENV
            value: Dev
          - name: RAW_LANDING_BUCKET
            value: sai-raw
          - name: AWS_ACCESS_KEY_ID
            valueFrom:
              secretKeyRef:
                name: aws-secret
                key: AWS_ACCESS_KEY_ID
          - name: AWS_SECRET_ACCESS_KEY
            valueFrom:
              secretKeyRef:
                name: aws-secret
                key: AWS_SECRET_ACCESS_KEY
          - name: AWS_DEFAULT_REGION
            value: eu-north1
          - name: AWS_ENDPOINT_URL
            value: https://storage.eu-north1.nebius.cloud:443</code></code></pre><p>You can create the deployment by running:</p><pre><code><code>kubectl apply -f deployment.yaml</code></code></pre><p>7.4. Service.</p><pre><code><code>apiVersion: v1
kind: Service
metadata:
  name: collector
  namespace: swirlai
spec:
  selector:
    app: collector
  ports:
  - name: collector
    protocol: TCP
    port: 80
    targetPort: 80</code></code></pre><p>You can create the service by running:</p><pre><code><code>kubectl apply -f service.yaml</code></code></pre><p>That&#8217;s it for the collector. It should now be running in the swirlai namespace, which you can verify with:</p><pre><code>kubectl get pods -n swirlai</code></pre><p></p><h3><strong>Implementing the Producer Applications.</strong></h3><p>We will not spend much time on the Producer application as it is really simple: it just downloads data from the internet and pushes it to the collector service.</p><p>Below is a screenshot of the structure of the Python project for the Producer application.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fps-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae862c3e-d041-4b1e-9669-90996050433e_2939x2952.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fps-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae862c3e-d041-4b1e-9669-90996050433e_2939x2952.png 424w, https://substackcdn.com/image/fetch/$s_!fps-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae862c3e-d041-4b1e-9669-90996050433e_2939x2952.png 848w, 
https://substackcdn.com/image/fetch/$s_!fps-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae862c3e-d041-4b1e-9669-90996050433e_2939x2952.png 1272w, https://substackcdn.com/image/fetch/$s_!fps-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae862c3e-d041-4b1e-9669-90996050433e_2939x2952.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!fps-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae862c3e-d041-4b1e-9669-90996050433e_2939x2952.png" width="558" height="560.2994505494505" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ae862c3e-d041-4b1e-9669-90996050433e_2939x2952.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a8891214-1d13-461e-a098-21b15cfd9cc2_2939x2952.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1462,&quot;width&quot;:1456,&quot;resizeWidth&quot;:558,&quot;bytes&quot;:653256,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa8891214-1d13-461e-a098-21b15cfd9cc2_2939x2952.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fps-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae862c3e-d041-4b1e-9669-90996050433e_2939x2952.png 424w, 
https://substackcdn.com/image/fetch/$s_!fps-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae862c3e-d041-4b1e-9669-90996050433e_2939x2952.png 848w, https://substackcdn.com/image/fetch/$s_!fps-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae862c3e-d041-4b1e-9669-90996050433e_2939x2952.png 1272w, https://substackcdn.com/image/fetch/$s_!fps-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fae862c3e-d041-4b1e-9669-90996050433e_2939x2952.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><ol><li><p><strong>The entrypoint of the application.</strong></p></li></ol><pre><code><code>import requests
import pandas as pd
import time
import config.logger as logger
import argparse
import os
import json


SOURCE_URL = os.environ['DATA_URL']
EVENT_TYPE = os.environ['EVENT_TYPE']
SCHEMA_VERSION = os.environ['SCHEMA_VERSION']
COLLECTOR_URL = os.environ['COLLECTOR_URL']

logger = logger.setup_logger()


def main(sleep_time: float, max_record_count: int) -&gt; None:

    df = pd.read_parquet(SOURCE_URL)

    if max_record_count == 0:
        max_record_count = len(df)

    record_counter = 0

    top_level_fields = {'event_type': EVENT_TYPE,
                        'schema_version': SCHEMA_VERSION}

    for index, row in df.iterrows():

        try:
            json_header = {'Content-Type': 'application/json'}
            url = COLLECTOR_URL
            row_data = row.to_dict()
            if EVENT_TYPE == 'YellowTaxiTripRecords':
                row_data['tpep_pickup_datetime'] = row_data['tpep_pickup_datetime'].strftime('%Y-%m-%d %H:%M:%S')
                row_data['tpep_dropoff_datetime'] = row_data['tpep_dropoff_datetime'].strftime('%Y-%m-%d %H:%M:%S')
            else:
                row_data['lpep_pickup_datetime'] = row_data['lpep_pickup_datetime'].strftime('%Y-%m-%d %H:%M:%S')
                row_data['lpep_dropoff_datetime'] = row_data['lpep_dropoff_datetime'].strftime('%Y-%m-%d %H:%M:%S')
            processed_event = {**top_level_fields, "payload": row_data}
            logger.info(requests.post(url, data=json.dumps(processed_event), headers=json_header))
        except Exception:
            # Log the full traceback and continue with the next record
            logger.exception('Failed to process payload')

        record_counter += 1
        time.sleep(sleep_time)

        if record_counter &gt;= max_record_count:
            break


if __name__ == "__main__":

    parser = argparse.ArgumentParser(description="Runs producer which writes TLCTripRecordData to collector endpoint")
    parser.add_argument('-s', '--sleep_time', required=False, default='0.5')
    parser.add_argument('-m', '--max_record_count', required=False, default='10000')
    args = parser.parse_args()

    logger.info(f'Starting sample producer (TLCTripRecordData) with {args.sleep_time} s between sending records')

    main(float(args.sleep_time), int(args.max_record_count))</code></code></pre><p>Note the following lines:</p><pre><code><code>SOURCE_URL = os.environ['DATA_URL']
EVENT_TYPE = os.environ['EVENT_TYPE']
SCHEMA_VERSION = os.environ['SCHEMA_VERSION']
COLLECTOR_URL = os.environ['COLLECTOR_URL']

&lt;...&gt;

df = pd.read_parquet(SOURCE_URL)</code></code></pre><p>We will pass the location of the NYC taxi datasets, together with other configuration, via environment variables that we will set when creating the Pods in Kubernetes.</p><ol start="2"><li><p><strong>Logger options that are the same as in the Collector application.</strong></p></li></ol><pre><code><code>import logging
import os

LOG_FORMAT = f'[%(asctime)s]: {os.getpid()} %(levelname)s %(message)s'
DATE_FORMAT = '%Y-%m-%d %H:%M:%S'
HANDLERS = [logging.StreamHandler()]
LOG_LEVEL = logging.DEBUG


def setup_logger() -&gt; logging.Logger:

    logging.basicConfig(level=LOG_LEVEL,
                        format=LOG_FORMAT,
                        datefmt=DATE_FORMAT,
                        handlers=HANDLERS)

    return logging.getLogger()</code></code></pre><ol start="3"><li><p><strong>Python requirements for the environment.</strong></p></li></ol><pre><code><code>pandas
pyarrow
requests</code></code></pre><ol start="4"><li><p><strong>Dockerfile for building the application container.</strong></p></li></ol><pre><code><code>FROM python:3.9

WORKDIR /code/src

COPY ./requirements.txt /code/requirements.txt

RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt

COPY ./src /code/src

CMD ["python", "app.py"]</code></code></pre><ol start="5"><li><p><strong>Kubernetes Pod definition manifest.</strong></p></li></ol><pre><code><code>apiVersion: v1
kind: Pod
metadata:
  name: producer-1
  namespace: swirlai
spec:
  containers:
  - name: producer
    image: aurimasg/producer-yt:0.2.0
    imagePullPolicy: Always
    env:
    - name: DATA_URL
      value: https://d37ci6vzurychx.cloudfront.net/trip-data/yellow_tripdata_2024-11.parquet
    - name: EVENT_TYPE
      value: YellowTaxiTripRecords
    - name: SCHEMA_VERSION
      value: 0-1-0
    - name: COLLECTOR_URL
      value: http://collector/api/v1/collect

---

apiVersion: v1
kind: Pod
metadata:
  name: producer-2
  namespace: swirlai
spec:
  containers:
  - name: producer
    image: aurimasg/producer-yt:0.2.0
    imagePullPolicy: Always
    env:
    - name: DATA_URL
      value: https://d37ci6vzurychx.cloudfront.net/trip-data/green_tripdata_2024-11.parquet
    - name: EVENT_TYPE
      value: GreenTaxiTripRecords
    - name: SCHEMA_VERSION
      value: 0-1-0
    - name: COLLECTOR_URL
      value: http://collector/api/v1/collect</code></code></pre><p>Note that we pass the location of the data to be downloaded via an environment variable. You can create the pods by running:</p><pre><code><code>kubectl apply -f pods.yaml</code></code></pre><p></p><h3>Implementing Spark ETL.</h3><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!j2JA!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52107b7b-6137-45f2-a4f9-fcc2ab31cd8d_3843x3230.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!j2JA!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52107b7b-6137-45f2-a4f9-fcc2ab31cd8d_3843x3230.png 424w, https://substackcdn.com/image/fetch/$s_!j2JA!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52107b7b-6137-45f2-a4f9-fcc2ab31cd8d_3843x3230.png 848w, https://substackcdn.com/image/fetch/$s_!j2JA!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52107b7b-6137-45f2-a4f9-fcc2ab31cd8d_3843x3230.png 1272w, https://substackcdn.com/image/fetch/$s_!j2JA!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52107b7b-6137-45f2-a4f9-fcc2ab31cd8d_3843x3230.png 
1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!j2JA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52107b7b-6137-45f2-a4f9-fcc2ab31cd8d_3843x3230.png" width="628" height="527.934065934066" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/52107b7b-6137-45f2-a4f9-fcc2ab31cd8d_3843x3230.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/82f82cb3-ddc1-4ee2-a58a-558accb33793_3843x3230.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1224,&quot;width&quot;:1456,&quot;resizeWidth&quot;:628,&quot;bytes&quot;:579718,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F82f82cb3-ddc1-4ee2-a58a-558accb33793_3843x3230.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!j2JA!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52107b7b-6137-45f2-a4f9-fcc2ab31cd8d_3843x3230.png 424w, https://substackcdn.com/image/fetch/$s_!j2JA!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52107b7b-6137-45f2-a4f9-fcc2ab31cd8d_3843x3230.png 848w, https://substackcdn.com/image/fetch/$s_!j2JA!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52107b7b-6137-45f2-a4f9-fcc2ab31cd8d_3843x3230.png 1272w, 
https://substackcdn.com/image/fetch/$s_!j2JA!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F52107b7b-6137-45f2-a4f9-fcc2ab31cd8d_3843x3230.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>We will implement the entire ETL in a single Airflow DAG. The code will also live in a single script for simplicity&#8217;s sake. 
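</p><p>To make the &#8220;expand and split&#8221; step from the list below more concrete, here is a minimal plain-Python sketch of the idea. It is not the Spark job itself: <em>split_events</em> and the sample records are hypothetical, and the event shape (<em>event_type</em>, <em>schema_version</em>, <em>payload</em>) is assumed to match what the Producer sends to the collector.</p>

```python
from collections import defaultdict
from typing import Any, Dict, Iterable, List, Tuple

DatasetKey = Tuple[str, str]


def split_events(events: Iterable[Dict[str, Any]]) -> Dict[DatasetKey, List[Dict[str, Any]]]:
    """Hypothetical helper: group raw events by (event_type, schema_version).

    Only the expanded payload of each event is kept, so every group
    holds flat records that belong to one dataset and one schema version.
    """
    datasets: Dict[DatasetKey, List[Dict[str, Any]]] = defaultdict(list)
    for event in events:
        key = (event["event_type"], event["schema_version"])
        datasets[key].append(dict(event["payload"]))
    return dict(datasets)


# Made-up sample records that follow the assumed event shape.
sample_events = [
    {"event_type": "YellowTaxiTripRecords", "schema_version": "0-1-0",
     "payload": {"trip_distance": 1.2}},
    {"event_type": "GreenTaxiTripRecords", "schema_version": "0-1-0",
     "payload": {"trip_distance": 3.4}},
    {"event_type": "YellowTaxiTripRecords", "schema_version": "0-1-0",
     "payload": {"trip_distance": 5.6}},
]

datasets = split_events(sample_events)
for (event_type, schema_version), rows in sorted(datasets.items()):
    print(event_type, schema_version, len(rows))
```

<p>In the actual ETL this grouping is done with Spark rather than in-memory Python; the sketch only illustrates how one (event type, schema version) pair maps to one dataset.</p><p>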
</p><ol><li><p>The data now continuously lands in the <em>sai-raw</em> bucket. Each file is a JSON file with 50 batched records.</p></li><li><p>The first thing we will want to do is move all of the accumulated data into the <em>sai-processing</em> bucket. Why?</p><ol><li><p>We will be reading the data with Spark, and it needs a static set of files for processing. Remember, <em>sai-raw</em> is continuously updated with new data.</p></li><li><p>We want to fix the ETL batch and timestamp so that if something unexpected happens, it is always possible to reprocess.</p></li></ol></li><li><p>We will read the JSON files in the <em>sai-processing</em> bucket with Spark, expand the JSON payloads into structured data and split the data into datasets and their versions accordingly.</p></li><li><p>Once Spark processing is completed, we will move the data that was processed into an archive bucket.</p></li></ol><p></p><h4>Creating Spark Cluster.</h4><p></p><ul><li><p>Creation of the cluster is straightforward: go to the Managed Spark section and click on &#8220;+Create cluster&#8221;.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SxXl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb41ab3-350c-4644-9c5b-a342f35ba571_3022x956.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SxXl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb41ab3-350c-4644-9c5b-a342f35ba571_3022x956.png 424w, https://substackcdn.com/image/fetch/$s_!SxXl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb41ab3-350c-4644-9c5b-a342f35ba571_3022x956.png 848w, 
https://substackcdn.com/image/fetch/$s_!SxXl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb41ab3-350c-4644-9c5b-a342f35ba571_3022x956.png 1272w, https://substackcdn.com/image/fetch/$s_!SxXl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb41ab3-350c-4644-9c5b-a342f35ba571_3022x956.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SxXl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb41ab3-350c-4644-9c5b-a342f35ba571_3022x956.png" width="1456" height="461" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8bb41ab3-350c-4644-9c5b-a342f35ba571_3022x956.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:461,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:380269,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb41ab3-350c-4644-9c5b-a342f35ba571_3022x956.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SxXl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb41ab3-350c-4644-9c5b-a342f35ba571_3022x956.png 424w, 
https://substackcdn.com/image/fetch/$s_!SxXl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb41ab3-350c-4644-9c5b-a342f35ba571_3022x956.png 848w, https://substackcdn.com/image/fetch/$s_!SxXl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb41ab3-350c-4644-9c5b-a342f35ba571_3022x956.png 1272w, https://substackcdn.com/image/fetch/$s_!SxXl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8bb41ab3-350c-4644-9c5b-a342f35ba571_3022x956.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><ul><li><p>Configure it:</p><ul><li><p>Give it a <em>name</em>.</p></li><li><p>Select the beefiest configuration, it&#8217;s free ;)</p></li><li><p>Create a password, you will need it in the Airflow section.</p></li></ul></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Eaqo!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff58e2d50-7124-4531-9c78-c0e0786d4112_3261x1874.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Eaqo!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff58e2d50-7124-4531-9c78-c0e0786d4112_3261x1874.png 424w, https://substackcdn.com/image/fetch/$s_!Eaqo!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff58e2d50-7124-4531-9c78-c0e0786d4112_3261x1874.png 848w, https://substackcdn.com/image/fetch/$s_!Eaqo!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff58e2d50-7124-4531-9c78-c0e0786d4112_3261x1874.png 1272w, https://substackcdn.com/image/fetch/$s_!Eaqo!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff58e2d50-7124-4531-9c78-c0e0786d4112_3261x1874.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Eaqo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff58e2d50-7124-4531-9c78-c0e0786d4112_3261x1874.png" width="1456" height="837" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f58e2d50-7124-4531-9c78-c0e0786d4112_3261x1874.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:837,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1198559,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff58e2d50-7124-4531-9c78-c0e0786d4112_3261x1874.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Eaqo!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff58e2d50-7124-4531-9c78-c0e0786d4112_3261x1874.png 424w, https://substackcdn.com/image/fetch/$s_!Eaqo!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff58e2d50-7124-4531-9c78-c0e0786d4112_3261x1874.png 848w, https://substackcdn.com/image/fetch/$s_!Eaqo!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff58e2d50-7124-4531-9c78-c0e0786d4112_3261x1874.png 1272w, https://substackcdn.com/image/fetch/$s_!Eaqo!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff58e2d50-7124-4531-9c78-c0e0786d4112_3261x1874.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><ul><li><p>Start a session:</p><ul><li><p>Go to the Sessions tab in the created cluster.</p></li><li><p>Click &#8220;+Create session&#8221;.</p></li></ul></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!GKG9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbdebe9f-fa86-426b-8de1-bb42e4260aa2_2920x906.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!GKG9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbdebe9f-fa86-426b-8de1-bb42e4260aa2_2920x906.png 424w, 
https://substackcdn.com/image/fetch/$s_!GKG9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbdebe9f-fa86-426b-8de1-bb42e4260aa2_2920x906.png 848w, https://substackcdn.com/image/fetch/$s_!GKG9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbdebe9f-fa86-426b-8de1-bb42e4260aa2_2920x906.png 1272w, https://substackcdn.com/image/fetch/$s_!GKG9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbdebe9f-fa86-426b-8de1-bb42e4260aa2_2920x906.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!GKG9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbdebe9f-fa86-426b-8de1-bb42e4260aa2_2920x906.png" width="1456" height="452" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dbdebe9f-fa86-426b-8de1-bb42e4260aa2_2920x906.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:452,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:424949,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbdebe9f-fa86-426b-8de1-bb42e4260aa2_2920x906.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!GKG9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbdebe9f-fa86-426b-8de1-bb42e4260aa2_2920x906.png 424w, https://substackcdn.com/image/fetch/$s_!GKG9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbdebe9f-fa86-426b-8de1-bb42e4260aa2_2920x906.png 848w, https://substackcdn.com/image/fetch/$s_!GKG9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbdebe9f-fa86-426b-8de1-bb42e4260aa2_2920x906.png 1272w, https://substackcdn.com/image/fetch/$s_!GKG9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdbdebe9f-fa86-426b-8de1-bb42e4260aa2_2920x906.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p>Most default configurations will be good:</p><ul><li><p><em>Name</em> your session.</p></li><li><p>Give some breathing room to the driver and executors by increasing Disk size a bit.</p></li><li><p>Bump the number of executors to 3.</p></li></ul></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!d9ee!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb9cb700-5b3a-4e60-b281-b97a69960f49_1820x1654.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!d9ee!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb9cb700-5b3a-4e60-b281-b97a69960f49_1820x1654.png 424w, https://substackcdn.com/image/fetch/$s_!d9ee!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb9cb700-5b3a-4e60-b281-b97a69960f49_1820x1654.png 848w, https://substackcdn.com/image/fetch/$s_!d9ee!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb9cb700-5b3a-4e60-b281-b97a69960f49_1820x1654.png 1272w, https://substackcdn.com/image/fetch/$s_!d9ee!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb9cb700-5b3a-4e60-b281-b97a69960f49_1820x1654.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!d9ee!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb9cb700-5b3a-4e60-b281-b97a69960f49_1820x1654.png" width="585" height="531.5625" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fb9cb700-5b3a-4e60-b281-b97a69960f49_1820x1654.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1323,&quot;width&quot;:1456,&quot;resizeWidth&quot;:585,&quot;bytes&quot;:452528,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb9cb700-5b3a-4e60-b281-b97a69960f49_1820x1654.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!d9ee!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb9cb700-5b3a-4e60-b281-b97a69960f49_1820x1654.png 424w, https://substackcdn.com/image/fetch/$s_!d9ee!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb9cb700-5b3a-4e60-b281-b97a69960f49_1820x1654.png 848w, https://substackcdn.com/image/fetch/$s_!d9ee!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb9cb700-5b3a-4e60-b281-b97a69960f49_1820x1654.png 1272w, https://substackcdn.com/image/fetch/$s_!d9ee!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb9cb700-5b3a-4e60-b281-b97a69960f49_1820x1654.png 
1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>That&#8217;s it, our Spark cluster is ready to take on work.</p><p></p><h4>Running Airflow.</h4><p>We will be deploying the Airflow application on the K8s cluster we have already deployed. Here is how you can easily do it in Nebius:</p><ul><li><p>In the target cluster window, select the Applications tab. 
Look for Airflow and select the card under &#8220;All applications&#8221;.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!H0ZR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ff502f-dbad-4824-997a-ad1c8c97c95d_6306x4805.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!H0ZR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ff502f-dbad-4824-997a-ad1c8c97c95d_6306x4805.png 424w, https://substackcdn.com/image/fetch/$s_!H0ZR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ff502f-dbad-4824-997a-ad1c8c97c95d_6306x4805.png 848w, https://substackcdn.com/image/fetch/$s_!H0ZR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ff502f-dbad-4824-997a-ad1c8c97c95d_6306x4805.png 1272w, https://substackcdn.com/image/fetch/$s_!H0ZR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ff502f-dbad-4824-997a-ad1c8c97c95d_6306x4805.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!H0ZR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ff502f-dbad-4824-997a-ad1c8c97c95d_6306x4805.png" width="1456" height="1109" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/66ff502f-dbad-4824-997a-ad1c8c97c95d_6306x4805.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1109,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2837978,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ff502f-dbad-4824-997a-ad1c8c97c95d_6306x4805.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!H0ZR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ff502f-dbad-4824-997a-ad1c8c97c95d_6306x4805.png 424w, https://substackcdn.com/image/fetch/$s_!H0ZR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ff502f-dbad-4824-997a-ad1c8c97c95d_6306x4805.png 848w, https://substackcdn.com/image/fetch/$s_!H0ZR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ff502f-dbad-4824-997a-ad1c8c97c95d_6306x4805.png 1272w, https://substackcdn.com/image/fetch/$s_!H0ZR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F66ff502f-dbad-4824-997a-ad1c8c97c95d_6306x4805.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p>You will be taken to the application details window; continue by selecting deploy. 
</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qUic!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56e8863-b699-456b-9943-42c264477097_6713x2843.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qUic!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56e8863-b699-456b-9943-42c264477097_6713x2843.png 424w, https://substackcdn.com/image/fetch/$s_!qUic!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56e8863-b699-456b-9943-42c264477097_6713x2843.png 848w, https://substackcdn.com/image/fetch/$s_!qUic!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56e8863-b699-456b-9943-42c264477097_6713x2843.png 1272w, https://substackcdn.com/image/fetch/$s_!qUic!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56e8863-b699-456b-9943-42c264477097_6713x2843.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qUic!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56e8863-b699-456b-9943-42c264477097_6713x2843.png" width="1456" height="617" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a56e8863-b699-456b-9943-42c264477097_6713x2843.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:617,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:2787852,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56e8863-b699-456b-9943-42c264477097_6713x2843.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qUic!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56e8863-b699-456b-9943-42c264477097_6713x2843.png 424w, https://substackcdn.com/image/fetch/$s_!qUic!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56e8863-b699-456b-9943-42c264477097_6713x2843.png 848w, https://substackcdn.com/image/fetch/$s_!qUic!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56e8863-b699-456b-9943-42c264477097_6713x2843.png 1272w, https://substackcdn.com/image/fetch/$s_!qUic!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa56e8863-b699-456b-9943-42c264477097_6713x2843.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p>In the following window edit some important fields:</p><ul><li><p>In the <em>Namespace</em> field add <em>airflow</em>. This will help us monitor the health of the applications, as we are deploying more than one in the cluster.</p></li><li><p><em>Webserver Secret Key</em> - follow the instructions on the right to generate the value.</p></li><li><p>In the <em>DAGs git repo</em> field add the URL of your GitHub repo. You can also use mine if you just want to follow the example - <em>https://github.com/AurimasGr/sai-nebius-spark.git</em></p></li><li><p><em>DAGs git sub path</em> - point to the folder where you are storing the DAG files in your repository. Mine are under <em>dags</em>, as you can see in the picture below.</p></li><li><p>[Important]: I have omitted one field from the image - <em>Custom Airflow Image. 
</em>We will need a custom one because we will be running Spark directly in Airflow via an attached Spark session, so we need the <em>nebius-connect</em> library installed. You can add <em>aurimasg/airflow</em> there; it will pull the image from my Docker Hub registry, provided you keep the Airflow version at the default 2.9.1. If you want to build your own image, you can do that by going to the airflow folder in my GitHub repo and running:</p></li></ul></li></ul><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0LaX!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3660392-6711-468a-8d1d-aba001e45b89_1779x1134.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0LaX!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3660392-6711-468a-8d1d-aba001e45b89_1779x1134.png 424w, https://substackcdn.com/image/fetch/$s_!0LaX!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3660392-6711-468a-8d1d-aba001e45b89_1779x1134.png 848w, https://substackcdn.com/image/fetch/$s_!0LaX!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3660392-6711-468a-8d1d-aba001e45b89_1779x1134.png 1272w, https://substackcdn.com/image/fetch/$s_!0LaX!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3660392-6711-468a-8d1d-aba001e45b89_1779x1134.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!0LaX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3660392-6711-468a-8d1d-aba001e45b89_1779x1134.png" width="286" height="182.28571428571428" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f3660392-6711-468a-8d1d-aba001e45b89_1779x1134.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:928,&quot;width&quot;:1456,&quot;resizeWidth&quot;:286,&quot;bytes&quot;:157623,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3660392-6711-468a-8d1d-aba001e45b89_1779x1134.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0LaX!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3660392-6711-468a-8d1d-aba001e45b89_1779x1134.png 424w, https://substackcdn.com/image/fetch/$s_!0LaX!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3660392-6711-468a-8d1d-aba001e45b89_1779x1134.png 848w, https://substackcdn.com/image/fetch/$s_!0LaX!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3660392-6711-468a-8d1d-aba001e45b89_1779x1134.png 1272w, 
https://substackcdn.com/image/fetch/$s_!0LaX!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff3660392-6711-468a-8d1d-aba001e45b89_1779x1134.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><pre><code>docker build . --platform linux/amd64 -t &lt;docker_hub_account&gt;/&lt;repository&gt;:2.9.1</code></pre><pre><code>docker push &lt;docker_hub_account&gt;/&lt;repository&gt;:2.9.1</code></pre><ul><li><p>Then you can also add <em>&lt;docker_hub_account&gt;/&lt;repository&gt; </em>in the <em>Custom Airflow Image</em> field.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nGln!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a9031-095d-43cc-aba7-30596419318d_6728x3630.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nGln!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a9031-095d-43cc-aba7-30596419318d_6728x3630.png 424w, https://substackcdn.com/image/fetch/$s_!nGln!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a9031-095d-43cc-aba7-30596419318d_6728x3630.png 848w, https://substackcdn.com/image/fetch/$s_!nGln!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a9031-095d-43cc-aba7-30596419318d_6728x3630.png 1272w, https://substackcdn.com/image/fetch/$s_!nGln!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a9031-095d-43cc-aba7-30596419318d_6728x3630.png 1456w" 
sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!nGln!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a9031-095d-43cc-aba7-30596419318d_6728x3630.png" width="1456" height="786" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/898a9031-095d-43cc-aba7-30596419318d_6728x3630.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:786,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:3489396,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a9031-095d-43cc-aba7-30596419318d_6728x3630.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nGln!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a9031-095d-43cc-aba7-30596419318d_6728x3630.png 424w, https://substackcdn.com/image/fetch/$s_!nGln!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a9031-095d-43cc-aba7-30596419318d_6728x3630.png 848w, https://substackcdn.com/image/fetch/$s_!nGln!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a9031-095d-43cc-aba7-30596419318d_6728x3630.png 1272w, 
https://substackcdn.com/image/fetch/$s_!nGln!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F898a9031-095d-43cc-aba7-30596419318d_6728x3630.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><ul><li><p>Deploy the application and wait for it to successfully spin up. 
You can check status by running:</p></li></ul><pre><code>kubectl get pods -n airflow</code></pre><p>Once fully deployed, you should see something like this:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!mFG9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F784ce701-fdaf-4c67-b1a3-8c85935a8987_1046x260.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!mFG9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F784ce701-fdaf-4c67-b1a3-8c85935a8987_1046x260.png 424w, https://substackcdn.com/image/fetch/$s_!mFG9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F784ce701-fdaf-4c67-b1a3-8c85935a8987_1046x260.png 848w, https://substackcdn.com/image/fetch/$s_!mFG9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F784ce701-fdaf-4c67-b1a3-8c85935a8987_1046x260.png 1272w, https://substackcdn.com/image/fetch/$s_!mFG9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F784ce701-fdaf-4c67-b1a3-8c85935a8987_1046x260.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!mFG9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F784ce701-fdaf-4c67-b1a3-8c85935a8987_1046x260.png" width="590" height="146.65391969407267" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/784ce701-fdaf-4c67-b1a3-8c85935a8987_1046x260.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:260,&quot;width&quot;:1046,&quot;resizeWidth&quot;:590,&quot;bytes&quot;:62374,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F784ce701-fdaf-4c67-b1a3-8c85935a8987_1046x260.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!mFG9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F784ce701-fdaf-4c67-b1a3-8c85935a8987_1046x260.png 424w, https://substackcdn.com/image/fetch/$s_!mFG9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F784ce701-fdaf-4c67-b1a3-8c85935a8987_1046x260.png 848w, https://substackcdn.com/image/fetch/$s_!mFG9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F784ce701-fdaf-4c67-b1a3-8c85935a8987_1046x260.png 1272w, https://substackcdn.com/image/fetch/$s_!mFG9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F784ce701-fdaf-4c67-b1a3-8c85935a8987_1046x260.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Expose access to airflow by running:</p><pre><code>kubectl -n airflow port-forward services/airflow-webserver 8080:8080</code></pre><p>You can now go to <em>localhost:8080</em> to 
access the Airflow UI. Log in with <em>admin:admin</em> credentials. Congratulations, now you can run your DAGs. If you have already pointed Airflow to the DAG folder that contains the ETL script, you will see some errors. They all relate to Airflow Variables that need to be configured to run the app. Here is what you need to do:</p><ul><li><p>Go to Admin &#8594; Variables.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uj7-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31b8ca1-ab76-4d2c-aef5-0127853dc71c_2980x941.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uj7-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31b8ca1-ab76-4d2c-aef5-0127853dc71c_2980x941.png 424w, https://substackcdn.com/image/fetch/$s_!uj7-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31b8ca1-ab76-4d2c-aef5-0127853dc71c_2980x941.png 848w, https://substackcdn.com/image/fetch/$s_!uj7-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31b8ca1-ab76-4d2c-aef5-0127853dc71c_2980x941.png 1272w, https://substackcdn.com/image/fetch/$s_!uj7-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31b8ca1-ab76-4d2c-aef5-0127853dc71c_2980x941.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uj7-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31b8ca1-ab76-4d2c-aef5-0127853dc71c_2980x941.png" width="1456" 
height="460" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f31b8ca1-ab76-4d2c-aef5-0127853dc71c_2980x941.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:460,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:423516,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31b8ca1-ab76-4d2c-aef5-0127853dc71c_2980x941.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uj7-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31b8ca1-ab76-4d2c-aef5-0127853dc71c_2980x941.png 424w, https://substackcdn.com/image/fetch/$s_!uj7-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31b8ca1-ab76-4d2c-aef5-0127853dc71c_2980x941.png 848w, https://substackcdn.com/image/fetch/$s_!uj7-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31b8ca1-ab76-4d2c-aef5-0127853dc71c_2980x941.png 1272w, https://substackcdn.com/image/fetch/$s_!uj7-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff31b8ca1-ab76-4d2c-aef5-0127853dc71c_2980x941.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" 
width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p>In the picture mine are already configured, but here is what you will need:</p><ul><li><p>AWS_ACCESS_KEY_ID_SECRET - the AWS-like AWS_ACCESS_KEY_ID you generated earlier in the collector section.</p></li><li><p>AWS_SECRET_ACCESS_KEY_SECRET - the AWS-like AWS_SECRET_ACCESS_KEY you generated earlier in the collector section.</p></li><li><p>AWS_DEFAULT_REGION - <em>eu-north1</em> if you are running in this region.</p></li><li><p>AWS_ENDPOINT_URL - https://storage.eu-north1.nebius.cloud:443</p></li><li><p>NB_SPARK_SESSION_ENDPOINT - find it as shown below:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!abfb!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c3d37b-041e-4984-a893-39f6d35624da_2986x1131.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!abfb!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c3d37b-041e-4984-a893-39f6d35624da_2986x1131.png 424w, https://substackcdn.com/image/fetch/$s_!abfb!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c3d37b-041e-4984-a893-39f6d35624da_2986x1131.png 848w, https://substackcdn.com/image/fetch/$s_!abfb!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c3d37b-041e-4984-a893-39f6d35624da_2986x1131.png 1272w, https://substackcdn.com/image/fetch/$s_!abfb!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c3d37b-041e-4984-a893-39f6d35624da_2986x1131.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!abfb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c3d37b-041e-4984-a893-39f6d35624da_2986x1131.png" width="639" height="241.81936813186815" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d4c3d37b-041e-4984-a893-39f6d35624da_2986x1131.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:551,&quot;width&quot;:1456,&quot;resizeWidth&quot;:639,&quot;bytes&quot;:555172,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c3d37b-041e-4984-a893-39f6d35624da_2986x1131.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!abfb!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c3d37b-041e-4984-a893-39f6d35624da_2986x1131.png 424w, https://substackcdn.com/image/fetch/$s_!abfb!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c3d37b-041e-4984-a893-39f6d35624da_2986x1131.png 848w, https://substackcdn.com/image/fetch/$s_!abfb!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c3d37b-041e-4984-a893-39f6d35624da_2986x1131.png 1272w, https://substackcdn.com/image/fetch/$s_!abfb!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd4c3d37b-041e-4984-a893-39f6d35624da_2986x1131.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" 
height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div></li><li><p>NB_SPARK_SESSION_PASSWORD_SECRET - Spark cluster password that you created when bootstrapping the cluster.</p></li></ul></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_d2d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76b0e536-30d4-4b46-a9e9-6134a3031837_2982x1391.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_d2d!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76b0e536-30d4-4b46-a9e9-6134a3031837_2982x1391.png 424w, 
https://substackcdn.com/image/fetch/$s_!_d2d!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76b0e536-30d4-4b46-a9e9-6134a3031837_2982x1391.png 848w, https://substackcdn.com/image/fetch/$s_!_d2d!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76b0e536-30d4-4b46-a9e9-6134a3031837_2982x1391.png 1272w, https://substackcdn.com/image/fetch/$s_!_d2d!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76b0e536-30d4-4b46-a9e9-6134a3031837_2982x1391.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_d2d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76b0e536-30d4-4b46-a9e9-6134a3031837_2982x1391.png" width="724" height="337.63461538461536" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/76b0e536-30d4-4b46-a9e9-6134a3031837_2982x1391.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:679,&quot;width&quot;:1456,&quot;resizeWidth&quot;:724,&quot;bytes&quot;:499996,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://www.newsletter.swirlai.com/i/156588105?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76b0e536-30d4-4b46-a9e9-6134a3031837_2982x1391.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!_d2d!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76b0e536-30d4-4b46-a9e9-6134a3031837_2982x1391.png 424w, https://substackcdn.com/image/fetch/$s_!_d2d!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76b0e536-30d4-4b46-a9e9-6134a3031837_2982x1391.png 848w, https://substackcdn.com/image/fetch/$s_!_d2d!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76b0e536-30d4-4b46-a9e9-6134a3031837_2982x1391.png 1272w, https://substackcdn.com/image/fetch/$s_!_d2d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F76b0e536-30d4-4b46-a9e9-6134a3031837_2982x1391.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" 
stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><p>Here is the code we will use:</p><ul><li><p>Definition of the DAG:</p></li></ul><pre><code>from datetime import datetime

from airflow import DAG
from airflow.decorators import task

# time_now and the *_BUCKET constants are assumed to be defined elsewhere in the DAG file
with DAG(dag_id="spark_etl", 
         start_date=datetime(2025, 1, 24), 
         max_active_runs=1, catchup=False, 
         schedule="*/20 * * * *") as dag:

    @task()
    def move_data_to_processing():
        move_objects(RAW_LANDING_BUCKET, RAW_PROCESSING_BUCKET)

    @task()
    def run_processing():
        process_data(time_now)

    @task()
    def clean_processing():
        archive_objects(RAW_PROCESSING_BUCKET, ARCHIVE_BUCKET, time_now)

    move_data_to_processing() &gt;&gt; run_processing() &gt;&gt; clean_processing()</code></pre><p>The DAG runs every 20 minutes and calls the functions <em>move_objects(), process_data(), archive_objects() </em>in sequence.</p><p>Here are the functions themselves:</p><ul><li><p><em>move_objects() - </em>simply moves objects from one bucket to another. We will use it to move all data accumulated in the <em>sai-raw</em> bucket to the <em>sai-processing</em> bucket:</p></li></ul><pre><code>def move_objects(source_bucket, destination_bucket):
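    # List all objects in the source bucket. list_objects_v2 returns at most
    # 1000 keys per call, so page through the results to avoid missing any.
    paginator = s3.get_paginator('list_objects_v2')
    objects = [obj
               for page in paginator.paginate(Bucket=source_bucket)
               for obj in page.get('Contents', [])]

    if not objects:
        print("No objects found in the source bucket.")
        return

    for obj in objects: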
        key = obj['Key']
        print(f"Moving: {key}")

        # Copy the object to the destination bucket
        copy_source = {'Bucket': source_bucket, 'Key': key}
        s3.copy_object(CopySource=copy_source, Bucket=destination_bucket, Key=key)

        # Delete the object from the source bucket after copying
        s3.delete_object(Bucket=source_bucket, Key=key)

        print(f"Moved {key} to {destination_bucket}")

    print("All objects have been moved.")

    return "Success"</code></pre><ul><li><p><em>process_data() - </em>this is the Spark function that reads all the JSON objects in the <em>sai-processing</em> bucket, groups events by event_type and major schema version, adds etl_timestamp and transforms the data into a structured format. It then writes the result to the <em>sai-processed</em> bucket:</p></li></ul><pre><code>def process_data(etl_timestamp):

    import os
    import urllib.request
    from os.path import expanduser

    from nebius.spark.connect import create_channel_builder
    from pyspark.sql.connect.session import SparkSession
    from pyspark.sql.functions import from_json, col, lit

    print(os.environ["NB_SPARK_SESSION_ENDPOINT"])

    # Download the CA certificate that is passed to the channel builder below
    url = "https://storage.eu-north1.nebius.cloud/msp-certs/ca.pem"

    urllib.request.urlretrieve(url, "ca.pem")

    nebius_spark_endpoint = os.environ["NB_SPARK_SESSION_ENDPOINT"] + ':443'
    nebius_spark_cb = create_channel_builder(
        nebius_spark_endpoint,
        password=os.environ["NB_SPARK_CLUSTER_PASSWORD"],
        root_certificates_file=expanduser('ca.pem') 
    )

    print(expanduser('ca.pem'))

    spark = (SparkSession
        .builder
        .channelBuilder(nebius_spark_cb)
        .getOrCreate())

    df_raw = spark.read.schema(schema).json(f"s3a://{RAW_PROCESSING_BUCKET}/")
    distinct_groups = df_raw.select("event_type", "schema_version").distinct().collect()

    for (event_type, schema_version) in [(str(row["event_type"]), str(row["schema_version"])) for row in distinct_groups]:
        (df_raw
        .filter(df_raw.event_type == event_type)
        .filter(df_raw.schema_version == schema_version)
        .withColumn('payload', from_json(col('payload'), schema_map.get(f"{event_type}_{schema_version}")))
        .withColumn('etl_timestamp', lit(etl_timestamp))
        .select("etl_timestamp", "collector_id", "collector_tstamp", "event_type", "root_id", "schema_version", "payload.*")
        .write
        .mode("append")
        .partitionBy("etl_timestamp")
        .parquet(f"s3a://{PROCESSED_BUCKET}/{event_type}/{schema_version.split('-')[0]}"))

    return "Success"</code></pre><ul><li><p><em>archive_objects() - </em>this is similar to the first function, but it moves the data from the <em>sai-processing</em> bucket to the <em>sai-archive</em> bucket and prefixes each key with etl_timestamp. Why? If we ever need to reprocess a batch of raw data, we can always do it from the archive bucket.</p></li></ul><pre><code>def archive_objects(source_bucket, destination_bucket, etl_timestamp):
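    # List all objects in the source bucket. list_objects_v2 returns at most
    # 1000 keys per call, so page through the results to avoid leaving any behind.
    paginator = s3.get_paginator('list_objects_v2')
    objects = [obj
               for page in paginator.paginate(Bucket=source_bucket)
               for obj in page.get('Contents', [])]

    if not objects:
        print("No objects found in the source bucket.")
        return

    for obj in objects: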
        key = obj['Key']
        print(f"Moving: {key}")

        # Copy the object to the destination bucket
        copy_source = {'Bucket': source_bucket, 'Key': key}
        s3.copy_object(CopySource=copy_source, Bucket=destination_bucket, Key=f"{etl_timestamp}/{key}")

        # Delete the object from the source bucket after copying
        s3.delete_object(Bucket=source_bucket, Key=key)

        print(f"Moved {key} to {destination_bucket}")

    print("All objects have been moved.")

    return "Success"</code></pre><p></p><h4>Congratulations on reaching the end!</h4><p>If you have reached this part of the Newsletter - congratulations! You should now see your Spark ETL running every 20 minutes, moving data between the buckets, while the <em>sai-raw </em>bucket is continuously filled with new raw data. </p><p>If you want to play around more, find new data sources and send them through the collector service. Then explore how the data changes in the <em>sai-processed</em> bucket.</p><p>Let me know if you had any issues while following the project! Hope to see you in the next article :)</p><p></p>]]></content:encoded></item><item><title><![CDATA[Building AI Agents from scratch - Part 2: Reflection and Working Memory]]></title><description><![CDATA[Let's implement AI Agent from scratch without using any framework. Today we implement the reflection pattern coupled with simple implementation of short-term memory.]]></description><link>https://www.newsletter.swirlai.com/p/building-ai-agents-from-scratch-part-8ca</link><guid isPermaLink="false">https://www.newsletter.swirlai.com/p/building-ai-agents-from-scratch-part-8ca</guid><dc:creator><![CDATA[Aurimas Griciūnas]]></dc:creator><pubDate>Sat, 04 Jan 2025 08:23:13 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/50268cdf-10a7-48f2-bc57-e7e31e5516f1_2882x2349.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div><hr></div><p>&#128075; <em>I am <a href="https://www.linkedin.com/in/aurimas-griciunas">Aurimas</a>. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. 
My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.</em></p><div><hr></div><p>I hope you had a wonderful holiday season! As we move into the <strong>year of AI Agents</strong>, I also have a present for you, the second part in the series of &#8220;Building AI Agents from scratch&#8221;. If you missed the previous episode (part 1), you can find it here:</p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;218f2235-c0f8-4e61-9b49-6fe1fd3781c8&quot;,&quot;caption&quot;:&quot;&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Building AI Agents from scratch - Part 1: Tool use&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;I have over a decade of work experience in various data related fields: Data Analytics, Data Science, Machine Learning, Data Engineering, Cloud Engineering. 
For three years I have led teams working with Data and Infrastructure.&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2024-12-21T10:30:19.983Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/1144abe7-1fb8-4190-b32d-6e59647c858b_2974x2388.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/building-ai-agents-from-scratch-part&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:153433846,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:103,&quot;comment_count&quot;:13,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;SwirlAI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27ed734e-48b5-446d-a93d-5a54178a0e34_1024x1024.png&quot;,&quot;belowTheFold&quot;:false,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div><p>There, we implemented the Tool Use pattern without using any LLM orchestration frameworks. Today we will build on top of that project; I will explain how later in the article.</p><p>In this article you will learn:</p><ul><li><p>What the Reflection pattern in AI Agent systems is.</p></li><li><p>How it relates to short-term memory.</p></li><li><p>Pros and cons of implementing Reflection.</p></li><li><p>How to build an Agent class that implements the Reflection pattern and takes memory into account, without using any orchestration frameworks.</p><ul><li><p>Just to remind ourselves why we are doing this. 
If you use an orchestration framework for agentic applications, it may abstract away how Agentic patterns are actually implemented. Understanding how the systems actually work helps you build up systems thinking, enabling you to craft advanced applications more efficiently.</p></li></ul></li><li><p>How to fix some of the hallucinations we were producing in the previous project!</p></li></ul><p>You can find the code examples for this and other projects in my GitHub repository here:</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://github.com/swirl-ai/ai-angineers-handbook&quot;,&quot;text&quot;:&quot;AI Engineer's Handbook&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://github.com/swirl-ai/ai-angineers-handbook"><span>AI Engineer's Handbook</span></a></p><p>Be sure to star the repo if you find the content useful; a lot more to come!</p><p>A YouTube video where I go through the first part of the series is coming next week. Be sure to subscribe <a href="https://www.youtube.com/@swirlai">here</a> so you do not miss it.</p><p>As always, if something does not work as expected, feel free to DM me or leave a comment and let&#8217;s figure it out together!</p><p></p><h3>Defining Reflection in AI Agents.</h3><p>As with most definitions in AI Agents nowadays, there is no single way to describe Reflection precisely. The high-level definition of the pattern is:</p><blockquote><p>The ability of an Agentic System to reflect on its outputs and suggest improvements. Optionally, the system also improves its future actions by incorporating the feedback provided.</p></blockquote><p>When explaining Agentic concepts I like to drop the abstraction of an Agent and think in Agentic flows - flow diagrams are easier to reason about, and Agents are ultimately just a number of steps interconnected via different topologies. 
Reflection can be applied in different steps of an Agentic flow. Let&#8217;s see how.</p><p></p><h4>The simplest example.</h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kGSd!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa642fd3b-a4bb-4465-bc4b-77948c14edea_1234x704.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kGSd!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa642fd3b-a4bb-4465-bc4b-77948c14edea_1234x704.png 424w, https://substackcdn.com/image/fetch/$s_!kGSd!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa642fd3b-a4bb-4465-bc4b-77948c14edea_1234x704.png 848w, https://substackcdn.com/image/fetch/$s_!kGSd!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa642fd3b-a4bb-4465-bc4b-77948c14edea_1234x704.png 1272w, https://substackcdn.com/image/fetch/$s_!kGSd!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa642fd3b-a4bb-4465-bc4b-77948c14edea_1234x704.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kGSd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa642fd3b-a4bb-4465-bc4b-77948c14edea_1234x704.png" width="572" height="326.3273905996758" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a642fd3b-a4bb-4465-bc4b-77948c14edea_1234x704.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/70db21a2-62ca-4f71-a70d-bf88fb10b50a_1234x704.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:704,&quot;width&quot;:1234,&quot;resizeWidth&quot;:572,&quot;bytes&quot;:81252,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kGSd!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa642fd3b-a4bb-4465-bc4b-77948c14edea_1234x704.png 424w, https://substackcdn.com/image/fetch/$s_!kGSd!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa642fd3b-a4bb-4465-bc4b-77948c14edea_1234x704.png 848w, https://substackcdn.com/image/fetch/$s_!kGSd!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa642fd3b-a4bb-4465-bc4b-77948c14edea_1234x704.png 1272w, https://substackcdn.com/image/fetch/$s_!kGSd!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa642fd3b-a4bb-4465-bc4b-77948c14edea_1234x704.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" 
stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Reflection: Simplest Case</figcaption></figure></div><p>The above Agentic flow includes the following steps:</p><ol><li><p>The user prompts the LLM with a query, and the LLM generates an answer.</p></li><li><p>The generated answer is passed back to the LLM, which is asked to give feedback on it and, if needed, provide instructions for improvement.</p></li><li><p>The improved answer is returned to the user.</p></li></ol><p>As simple as it is, even this pipeline can noticeably improve the accuracy of the answers in many cases. 
Of course, similar improvements can be achieved by engineering the system prompt, but the Reflection pattern is usually more powerful and flexible.</p><p></p><h4>Reflection Loop.</h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!neQ7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff19163d0-6ead-4bdb-95d5-fafd5acef297_1234x704.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!neQ7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff19163d0-6ead-4bdb-95d5-fafd5acef297_1234x704.png 424w, https://substackcdn.com/image/fetch/$s_!neQ7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff19163d0-6ead-4bdb-95d5-fafd5acef297_1234x704.png 848w, https://substackcdn.com/image/fetch/$s_!neQ7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff19163d0-6ead-4bdb-95d5-fafd5acef297_1234x704.png 1272w, https://substackcdn.com/image/fetch/$s_!neQ7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff19163d0-6ead-4bdb-95d5-fafd5acef297_1234x704.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!neQ7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff19163d0-6ead-4bdb-95d5-fafd5acef297_1234x704.png" width="595" height="339.4489465153971" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f19163d0-6ead-4bdb-95d5-fafd5acef297_1234x704.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6113ad59-4a6b-4ad4-b296-04d05af958f7_1234x704.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:704,&quot;width&quot;:1234,&quot;resizeWidth&quot;:595,&quot;bytes&quot;:86133,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!neQ7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff19163d0-6ead-4bdb-95d5-fafd5acef297_1234x704.png 424w, https://substackcdn.com/image/fetch/$s_!neQ7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff19163d0-6ead-4bdb-95d5-fafd5acef297_1234x704.png 848w, https://substackcdn.com/image/fetch/$s_!neQ7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff19163d0-6ead-4bdb-95d5-fafd5acef297_1234x704.png 1272w, https://substackcdn.com/image/fetch/$s_!neQ7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff19163d0-6ead-4bdb-95d5-fafd5acef297_1234x704.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Reflection: The Loop</figcaption></figure></div><p>The above Agentic flow is defined via the following steps:</p><ol><li><p>A user query is passed to the LLM.</p></li><li><p>The generated answer is passed back to the LLM, which is asked to provide feedback on it and, if needed, instructions for improvement.</p></li><li><p>After the improvements are applied, the improved answer is passed to the LLM again, and it is asked for feedback and suggestions once more. This loop is repeated a predefined number of times, or until the LLM has no more suggestions and returns a stop signal, usually a predefined string like &#8220;END&#8221;.</p></li><li><p>The final answer is returned to the user.</p></li></ol><p>This kind of system is powerful, but requires a very specific use case. 
In the literature, it is most often described for code generation. It is easy to understand why: generated code can be improved continuously over multiple iterations. The system will always find where to over-engineer ;)</p><p>Andrew Ng has described this pattern for code in one of his articles <a href="https://www.deeplearning.ai/the-batch/agentic-design-patterns-part-2-reflection/">here</a>.</p><p>In most real-world use cases, I would find it hard to make a business case for a reflection loop, as it usually increases costs significantly.</p><p></p><h4>Validating execution plans.</h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_6Qg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe45a3fb8-84cd-4775-9b5f-421632628181_1194x828.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_6Qg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe45a3fb8-84cd-4775-9b5f-421632628181_1194x828.png 424w, https://substackcdn.com/image/fetch/$s_!_6Qg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe45a3fb8-84cd-4775-9b5f-421632628181_1194x828.png 848w, https://substackcdn.com/image/fetch/$s_!_6Qg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe45a3fb8-84cd-4775-9b5f-421632628181_1194x828.png 1272w, https://substackcdn.com/image/fetch/$s_!_6Qg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe45a3fb8-84cd-4775-9b5f-421632628181_1194x828.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!_6Qg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe45a3fb8-84cd-4775-9b5f-421632628181_1194x828.png" width="584" height="404.9849246231156" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e45a3fb8-84cd-4775-9b5f-421632628181_1194x828.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/328b13b3-7959-4b83-bc56-838237bfe39d_1194x828.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:828,&quot;width&quot;:1194,&quot;resizeWidth&quot;:584,&quot;bytes&quot;:102839,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_6Qg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe45a3fb8-84cd-4775-9b5f-421632628181_1194x828.png 424w, https://substackcdn.com/image/fetch/$s_!_6Qg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe45a3fb8-84cd-4775-9b5f-421632628181_1194x828.png 848w, https://substackcdn.com/image/fetch/$s_!_6Qg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe45a3fb8-84cd-4775-9b5f-421632628181_1194x828.png 1272w, https://substackcdn.com/image/fetch/$s_!_6Qg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe45a3fb8-84cd-4775-9b5f-421632628181_1194x828.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Reflection: Validation of execution plans</figcaption></figure></div><p>I find that one useful place for a Reflection step is validating an execution plan, if planning is part of the Agentic flow. Let&#8217;s see what it could look like:</p><ol><li><p>A user query with intent and a system prompt is passed to the LLM. The LLM generates an execution plan. This is also an important point where the execution of the flow might break. 
An example of the breakage would be the following:</p><ol><li><p>The Agentic flow decides whether a direct answer should be returned to the user or a predefined tool should be used to generate additional context.</p></li><li><p>A decision is made that a tool should be used.</p></li><li><p>An incorrect set of parameters to be passed to the tool is hallucinated.</p></li><li><p>The tool returns an error.</p></li></ol></li><li><p>With a properly crafted Reflection step, we can prompt the Agent to try to fix any hallucinations produced in the previous step.</p></li><li><p>The plan can return a direct answer that is provided to the user,</p></li><li><p>Or it can call for a tool, whose output enriches the answer via another LLM call before it is returned to the user.</p></li></ol><p>We will actually be implementing this kind of flow in our hands-on example in the following sections.</p><p></p><h4>More Complex Reflection flows.</h4><p>As mentioned before, there is no predefined place for a reflection step to be invoked. 
In complex Agentic flows, it can be used multiple times to validate intermediate answers, plans, or any other part of the Agentic topology.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!BjrJ!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42f7fa39-9c99-41a7-a1fd-badcf622a31c_1704x930.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!BjrJ!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42f7fa39-9c99-41a7-a1fd-badcf622a31c_1704x930.png 424w, https://substackcdn.com/image/fetch/$s_!BjrJ!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42f7fa39-9c99-41a7-a1fd-badcf622a31c_1704x930.png 848w, https://substackcdn.com/image/fetch/$s_!BjrJ!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42f7fa39-9c99-41a7-a1fd-badcf622a31c_1704x930.png 1272w, https://substackcdn.com/image/fetch/$s_!BjrJ!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42f7fa39-9c99-41a7-a1fd-badcf622a31c_1704x930.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!BjrJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42f7fa39-9c99-41a7-a1fd-badcf622a31c_1704x930.png" width="1456" height="795" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/42f7fa39-9c99-41a7-a1fd-badcf622a31c_1704x930.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c6b25972-25e0-434d-bc82-9b2eba991f26_1704x930.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:795,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:154028,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!BjrJ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42f7fa39-9c99-41a7-a1fd-badcf622a31c_1704x930.png 424w, https://substackcdn.com/image/fetch/$s_!BjrJ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42f7fa39-9c99-41a7-a1fd-badcf622a31c_1704x930.png 848w, https://substackcdn.com/image/fetch/$s_!BjrJ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42f7fa39-9c99-41a7-a1fd-badcf622a31c_1704x930.png 1272w, https://substackcdn.com/image/fetch/$s_!BjrJ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F42f7fa39-9c99-41a7-a1fd-badcf622a31c_1704x930.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Reflection: Complex Agentic flows</figcaption></figure></div><p>In reality, to automate the complex processes present in organisations, you would build multi-step topologies with multiple probabilistic routers connecting execution nodes. Some routers could be implemented via LLMs, some might be rule-based, and some might use regular ML models. Some execution nodes will have tools, some will be just LLM calls, and some will have non-probabilistic executions. Reflection steps could be placed throughout the topology to increase the accuracy of non-deterministic routers and executions.</p><p></p><h3>Connection between Agent Memory and Reflection pattern.</h3><p>Usually, some form of short-term (working) memory is needed for Reflection to provide the best results. 
Let&#8217;s see why by examining the third example of Agentic flows defined above.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7NSW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b23ab7a-979f-4ba1-acdd-acaca8871040_1194x913.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7NSW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b23ab7a-979f-4ba1-acdd-acaca8871040_1194x913.png 424w, https://substackcdn.com/image/fetch/$s_!7NSW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b23ab7a-979f-4ba1-acdd-acaca8871040_1194x913.png 848w, https://substackcdn.com/image/fetch/$s_!7NSW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b23ab7a-979f-4ba1-acdd-acaca8871040_1194x913.png 1272w, https://substackcdn.com/image/fetch/$s_!7NSW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b23ab7a-979f-4ba1-acdd-acaca8871040_1194x913.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7NSW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b23ab7a-979f-4ba1-acdd-acaca8871040_1194x913.png" width="582" height="445.03015075376885" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/7b23ab7a-979f-4ba1-acdd-acaca8871040_1194x913.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/82633b7e-cfdd-497a-bdfa-28eacb00157e_1194x913.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:913,&quot;width&quot;:1194,&quot;resizeWidth&quot;:582,&quot;bytes&quot;:114212,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7NSW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b23ab7a-979f-4ba1-acdd-acaca8871040_1194x913.png 424w, https://substackcdn.com/image/fetch/$s_!7NSW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b23ab7a-979f-4ba1-acdd-acaca8871040_1194x913.png 848w, https://substackcdn.com/image/fetch/$s_!7NSW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b23ab7a-979f-4ba1-acdd-acaca8871040_1194x913.png 1272w, https://substackcdn.com/image/fetch/$s_!7NSW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F7b23ab7a-979f-4ba1-acdd-acaca8871040_1194x913.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Working Memory for Reflection</figcaption></figure></div><p>The Agentic flow up to the point where the Reflection step can be invoked includes:</p><p><em>A)</em> A System Prompt and a user query that are passed to the LLM in order to generate the initial Execution Plan.</p><p><em>B)</em> The Execution Plan generated by the initial steps.</p><p><em>C)</em> The Reflection step will need all of this information: the System Prompt will most likely contain useful information about the available tools and similar details, the user query carries the context about user intent, and the plan is what we are trying to improve.</p><p>In a base generation implementation we are not keeping this information in memory, which is why Working Memory needs to be implemented.</p><div class="captioned-image-container"><figure><a class="image-link image2 
is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Je5w!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397c1965-c739-4d9b-a821-a06a817e4130_2383x1276.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Je5w!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397c1965-c739-4d9b-a821-a06a817e4130_2383x1276.png 424w, https://substackcdn.com/image/fetch/$s_!Je5w!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397c1965-c739-4d9b-a821-a06a817e4130_2383x1276.png 848w, https://substackcdn.com/image/fetch/$s_!Je5w!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397c1965-c739-4d9b-a821-a06a817e4130_2383x1276.png 1272w, https://substackcdn.com/image/fetch/$s_!Je5w!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397c1965-c739-4d9b-a821-a06a817e4130_2383x1276.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Je5w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397c1965-c739-4d9b-a821-a06a817e4130_2383x1276.png" width="1456" height="780" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/397c1965-c739-4d9b-a821-a06a817e4130_2383x1276.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d08e656-9fb1-42f4-bde0-803ad36e5432_2383x1276.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:780,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:186766,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Je5w!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397c1965-c739-4d9b-a821-a06a817e4130_2383x1276.png 424w, https://substackcdn.com/image/fetch/$s_!Je5w!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397c1965-c739-4d9b-a821-a06a817e4130_2383x1276.png 848w, https://substackcdn.com/image/fetch/$s_!Je5w!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397c1965-c739-4d9b-a821-a06a817e4130_2383x1276.png 1272w, https://substackcdn.com/image/fetch/$s_!Je5w!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F397c1965-c739-4d9b-a821-a06a817e4130_2383x1276.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Working Memory: Simplest implementation</figcaption></figure></div><p>The above picture shows the simplest implementation of Working Memory: each interaction with the agent is simply stored in a list and passed as additional context, in addition to the system prompt, each time generation is invoked. This is how the simplest type of memory is implemented in chatbots (e.g. 
ChatGPT).</p><p>If we are implementing an Agent with planning capability and want to add reflection, this is what it could look like:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!6Stj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdefc4bc5-bdd8-4c1d-9c31-ffc47e56434d_1642x1124.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!6Stj!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdefc4bc5-bdd8-4c1d-9c31-ffc47e56434d_1642x1124.png 424w, https://substackcdn.com/image/fetch/$s_!6Stj!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdefc4bc5-bdd8-4c1d-9c31-ffc47e56434d_1642x1124.png 848w, https://substackcdn.com/image/fetch/$s_!6Stj!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdefc4bc5-bdd8-4c1d-9c31-ffc47e56434d_1642x1124.png 1272w, https://substackcdn.com/image/fetch/$s_!6Stj!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdefc4bc5-bdd8-4c1d-9c31-ffc47e56434d_1642x1124.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!6Stj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdefc4bc5-bdd8-4c1d-9c31-ffc47e56434d_1642x1124.png" width="590" height="404.0041208791209" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/defc4bc5-bdd8-4c1d-9c31-ffc47e56434d_1642x1124.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5aabf775-f742-4ece-98b0-0b88de4efe59_1642x1124.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:997,&quot;width&quot;:1456,&quot;resizeWidth&quot;:590,&quot;bytes&quot;:152334,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!6Stj!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdefc4bc5-bdd8-4c1d-9c31-ffc47e56434d_1642x1124.png 424w, https://substackcdn.com/image/fetch/$s_!6Stj!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdefc4bc5-bdd8-4c1d-9c31-ffc47e56434d_1642x1124.png 848w, https://substackcdn.com/image/fetch/$s_!6Stj!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdefc4bc5-bdd8-4c1d-9c31-ffc47e56434d_1642x1124.png 1272w, https://substackcdn.com/image/fetch/$s_!6Stj!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdefc4bc5-bdd8-4c1d-9c31-ffc47e56434d_1642x1124.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" 
stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Working Memory: Planning and Reflection</figcaption></figure></div><p>As you can see, the flow is very similar. The only difference is that the agent&#8217;s initial response is the execution plan, and the user&#8217;s next query is a prompt to reflect on that plan given the previous interactions.</p><p></p><h3>Pros and Cons of using Reflection.</h3><p>As with anything that can bring improvements to the system, there are always upsides and downsides.</p><h4>Pros.</h4><ul><li><p>Almost guaranteed improvement in the accuracy of the final outputs of the system. Sometimes this is the only way to make your application feasible due to extremely high accuracy requirements.</p></li><li><p>Flexible compared to editing the initial system prompt. E.g. 
different Reflection methods or even agents can be utilised for different parts of the pipeline.</p></li><li><p>It is possible to achieve more with small models when Reflection is applied. Of course, you should analyse the tradeoff given all of the cons, but in general there is always a phase in the project where the goal is to optimise the costs of your AI system.</p></li></ul><h4>Cons.</h4><ul><li><p>Adds complexity to the application.</p></li><li><p>Adds latency to the end-to-end flow since additional LLM calls are invoked.</p></li><li><p>Adds cost since the LLM is prompted at least one more time (usually more than once) with every Reflection step.</p></li></ul><p></p><h3>Building the Reflection Agent.</h3><p>As mentioned at the beginning of the article, we will be working on an example where Reflection is used to revise an action plan generated by an LLM. And this is not without reason!</p><p>If you followed me in the <a href="https://www.newsletter.swirlai.com/p/building-ai-agents-from-scratch-part">first part</a> of the series where we implemented tool usage from scratch, you might remember that we implemented a tool that is capable of converting between two currencies on demand. We prompted the agent to only use the tool if the conversion is actually needed. It worked well, but in the example we asked for Serbian to Japanese currency conversion.</p><p>Now, this was not my initial intent. 
I am from Lithuania and I wanted to showcase the scenario where I would be traveling from Lithuania to Japan, so I initially prompted the model with the following query:</p><pre><code>I am traveling to Japan from Lithuania, I have 1500 of local currency, how much of Japanese currency will I be able to get?</code></pre><p>Imagine my surprise when the agent returned the following response:</p><pre><code>Thought: I need to convert 1500 Lithuanian Litas (LTL) to Japanese Yen (JPY) using the currency conversion tool.
Plan: Use convert_currency tool to convert 1500 LTL to JPY. Return the conversion result
Results: Error: Could not fetch exchange rates</code></pre><p>Of course the tool call resulted in an error: Lithuania has used the Euro (EUR) as its official currency since 2015! The tool was obviously not able to find an LTL to JPY conversion rate.</p><p>And then I took it personally. In this episode, my quest is to try and fix the plan using the Reflection pattern.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wvjR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d419729-d19c-42d0-a407-9c36ea5b9073_1194x913.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wvjR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d419729-d19c-42d0-a407-9c36ea5b9073_1194x913.png 424w, https://substackcdn.com/image/fetch/$s_!wvjR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d419729-d19c-42d0-a407-9c36ea5b9073_1194x913.png 848w, https://substackcdn.com/image/fetch/$s_!wvjR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d419729-d19c-42d0-a407-9c36ea5b9073_1194x913.png 1272w, https://substackcdn.com/image/fetch/$s_!wvjR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d419729-d19c-42d0-a407-9c36ea5b9073_1194x913.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wvjR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d419729-d19c-42d0-a407-9c36ea5b9073_1194x913.png" width="568" height="434.3249581239531" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/5d419729-d19c-42d0-a407-9c36ea5b9073_1194x913.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/6841ecfd-3ca9-4b29-b21f-076ed38e98cc_1194x913.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:913,&quot;width&quot;:1194,&quot;resizeWidth&quot;:568,&quot;bytes&quot;:114212,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wvjR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d419729-d19c-42d0-a407-9c36ea5b9073_1194x913.png 424w, https://substackcdn.com/image/fetch/$s_!wvjR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d419729-d19c-42d0-a407-9c36ea5b9073_1194x913.png 848w, https://substackcdn.com/image/fetch/$s_!wvjR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d419729-d19c-42d0-a407-9c36ea5b9073_1194x913.png 1272w, https://substackcdn.com/image/fetch/$s_!wvjR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5d419729-d19c-42d0-a407-9c36ea5b9073_1194x913.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" 
stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Reflection: Fixing the Plan</figcaption></figure></div><p>You can also find the code in a GitHub repository <a href="https://github.com/swirl-ai/ai-angineers-handbook/tree/main/building_agents_from_scratch/planning/reflection">here</a>.</p><p>You can follow the tutorial using the Jupyter notebook <a href="https://github.com/swirl-ai/ai-angineers-handbook/blob/main/building_agents_from_scratch/planning/reflection/notebooks/reflection.ipynb">here</a>.</p><p></p><h4>Implementing the working memory.</h4><p>We will implement a single interaction with an agent as a Dataclass:</p><pre><code>@dataclass
class Interaction:
    """Record of a single interaction with the agent"""
    timestamp: datetime
    query: str
    plan: Dict[str, Any]</code></pre><p>It will hold the query and the plan that the agent produced.</p><p>It is important to note that we will also need the system prompt to be available for the reflection step, but we will implement that separately.</p><p>We will be simplifying the Agent Class this time around by stripping any tool-related functionality from it to better focus on reflecting on the plan generated by the Agent.</p><p></p><h4>The initial system prompt.</h4><p>We will keep the system prompt almost identical to the one we implemented in part 1 of the series. You can find the explanation in <a href="https://www.newsletter.swirlai.com/i/153433846/crafting-the-system-prompt">this</a> section of my previous article. The two differences are:</p><ul><li><p>We are now mocking the available tool instead of actually implementing it. So the tools section will look like this:</p></li></ul><pre><code>"tools": [
    {
        "name": "convert_currency",
        "description": "Converts currency using latest exchange rates.",
        "parameters": {
            "amount": {
                "type": "float",
                "description": "Amount to convert"
            },
            "from_currency": {
                "type": "str",
                "description": "Source currency code (e.g., USD)"
            },
            "to_currency": {
                "type": "str",
                "description": "Target currency code (e.g., EUR)"
            }
        }
    }
]</code></pre><ul><li><p>We are extending the capabilities and instructions of the agent by adding a fourth entry to both:</p></li></ul><pre><code>"capabilities": [
    "Using provided tools to help users when necessary",
    "Responding directly without tools for questions that don't require tool usage",
    "Planning efficient tool usage sequences",
    "If asked by the user, reflecting on the plan and suggesting changes if needed"
],
"instructions": [
    "Use tools only when they are necessary for the task",
    "If a query can be answered directly, respond with a simple message instead of using tools",
    "When tools are needed, plan their usage efficiently to minimize tool calls",
    "If asked by the user, reflect on the plan and suggest changes if needed"
]</code></pre><p></p><h4>Implementing the Agent Class.</h4><p>The agent class will be initialised with an empty Interaction history. Interaction history IS the working memory in this example.</p><pre><code>import json
import os
from typing import Any, Dict, List

import openai

class Agent:
    def __init__(self, model: str = "gpt-4o-mini"):
        """Initialize Agent with empty interaction history."""
        self.client = openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
        self.interactions: List[Interaction] = []  # Working memory
        self.model = model</code></pre><p></p><h4><em>Planning.</em></h4><p>The <em>plan</em> method prompts the LLM to create the initial execution plan and updates the working memory with the user query and the generated plan.</p><pre><code>def plan(self, user_query: str) -&gt; Dict:
    """Use LLM to create a plan and store it in memory."""
    messages = [
        {"role": "system", "content": self.create_system_prompt()},
        {"role": "user", "content": user_query}
    ]
    
    response = self.client.chat.completions.create(
        model=self.model,
        messages=messages,
        temperature=0
    )
    
    try:
        plan = json.loads(response.choices[0].message.content)
        # Store the interaction immediately after planning
        interaction = Interaction(
            timestamp=datetime.now(),
            query=user_query,
            plan=plan
        )
        self.interactions.append(interaction)
        return plan
    except json.JSONDecodeError:
        raise ValueError("Failed to parse LLM response as JSON")</code></pre><p></p><h4><em>Reflecting on the plan.</em></h4><p>The <em>reflect_on_plan</em> method is where the interesting part happens.</p><pre><code>def reflect_on_plan(self) -&gt; Dict[str, Any]:
    """Reflect on the most recent plan using interaction history."""
    if not self.interactions:
        return {"reflection": "No plan to reflect on", "requires_changes": False}
    
    latest_interaction = self.interactions[-1]
    
    reflection_prompt = {
        "task": "reflection",
        "context": {
            "user_query": latest_interaction.query,
            "generated_plan": latest_interaction.plan
        },
        "instructions": [
            "Review the generated plan for potential improvements",
            "Consider if the chosen tools are appropriate",
            "Verify tool parameters are correct",
            "Check if the plan is efficient",
            "Determine if tools are actually needed"
        ],
        "response_format": {
            "type": "json",
            "schema": {
                "requires_changes": {
                    "type": "boolean",
                    "description": "whether the plan needs modifications"
                },
                "reflection": {
                    "type": "string",
                    "description": "explanation of what changes are needed or why no changes are needed"
                },
                "suggestions": {
                    "type": "array",
                    "items": {"type": "string"},
                    "description": "specific suggestions for improvements",
                    "optional": True
                }
            }
        }
    }
    
    messages = [
        {"role": "system", "content": self.create_system_prompt()},
        {"role": "user", "content": json.dumps(reflection_prompt, indent=2)}
    ]
    
    response = self.client.chat.completions.create(
        model=self.model,
        messages=messages,
        temperature=0
    )
    
    try:
        return json.loads(response.choices[0].message.content)
    except json.JSONDecodeError:
        return {"reflection": response.choices[0].message.content}</code></pre><p>We create a new prompt with specific instructions and the desired output format. Note that the prompt carries both the user query and the initial plan as additional context via the <em>context</em> key. We then pass this prompt together with the initial system prompt to generate a reflection on the initial plan, including any suggestions for improvement. If the response cannot be parsed as JSON, the raw text is returned under the <em>reflection</em> key without a <em>requires_changes</em> flag.</p><p></p><h4><em>Executing the flow and generating the Revised Plan.</em></h4><p>We now stitch the flow together via the <em>execute</em> method.</p><pre><code>def execute(self, user_query: str) -&gt; str:
    """Execute the full pipeline: plan, reflect, and potentially replan."""
    try:
        # Create initial plan (this also stores it in memory)
        initial_plan = self.plan(user_query)
        
        # Reflect on the plan using memory
        reflection = self.reflect_on_plan()
        
        # Check if reflection suggests changes
        if reflection.get("requires_changes", False):
            # Generate new plan based on reflection
            messages = [
                {"role": "system", "content": self.create_system_prompt()},
                {"role": "user", "content": user_query},
                {"role": "assistant", "content": json.dumps(initial_plan)},
                {"role": "user", "content": f"Please revise the plan based on this feedback: {json.dumps(reflection)}"}
            ]
            
            response = self.client.chat.completions.create(
                model=self.model,
                messages=messages,
                temperature=0
            )
            
            try:
                final_plan = json.loads(response.choices[0].message.content)
            except json.JSONDecodeError:
                final_plan = initial_plan  # Fallback to initial plan if parsing fails
        else:
            final_plan = initial_plan
        
        # Update the stored interaction with all information
        self.interactions[-1].plan = {
            "initial_plan": initial_plan,
            "reflection": reflection,
            "final_plan": final_plan
        }
        
        # Return the appropriate response
        if final_plan.get("requires_tools", True):
            return f"""Initial Thought: {initial_plan['thought']}
Initial Plan: {'. '.join(initial_plan['plan'])}
Reflection: {reflection.get('reflection', 'No improvements suggested')}
Final Plan: {'. '.join(final_plan['plan'])}"""
        else:
            return f"""Response: {final_plan['direct_response']}
Reflection: {reflection.get('reflection', 'No improvements suggested')}"""
        
    except Exception as e:
        return f"Error executing plan: {str(e)}"</code></pre><p>Note that the method revises the plan ONLY if the reflection step reports that changes are required. Otherwise, we keep the initial plan. We also fall back to the initial plan if the revised plan cannot be parsed as JSON.</p><p></p><h4>Executing the Agent.</h4><p>Let&#8217;s see if we managed to fix the plan! Execute the agent by running:</p><pre><code>query_list = ["I am traveling to Japan from Lithuania, I have 1500 of local currency, how much of Japanese currency will I be able to get?"]

agent = Agent()  # create the agent with the default model ("gpt-4o-mini")

for query in query_list:
    print(f"\nQuery: {query}")
    result = agent.execute(query)
    print(result)</code></pre><p>If you used the same model as I did, you should get something similar to:</p><pre><code>Query: I am traveling to Japan from Lithuania, I have 1500 of local currency, how much of Japanese currency will I be able to get?
Initial Thought: I need to convert 1500 Lithuanian Litas (LTL) to Japanese Yen (JPY) using the currency conversion tool.
Initial Plan: Use convert_currency tool to convert 1500 LTL to JPY. Return the conversion result
Reflection: The plan needs modifications because the Lithuanian Litas (LTL) is no longer in use since Lithuania adopted the Euro (EUR) in 2015. Therefore, the conversion should be from EUR to JPY instead of LTL.
Final Plan: Use convert_currency tool to convert 1500 EUR to JPY. Return the conversion result</code></pre><p>Great success! We have fixed the plan and would now be able to execute the tool successfully.</p><p></p><h4>That&#8217;s it for today, we&#8217;ve learned:</h4><ul><li><p>How to implement a simple type of Working Memory.</p></li><li><p>How to construct a Reflection step to reflect on the execution plan generated by the Agent.</p></li><li><p>How to implement the suggestions generated by the reflection step and improve the plan.</p></li></ul><div><hr></div>]]></content:encoded></item><item><title><![CDATA[Building AI Agents from scratch - Part 1: Tool use]]></title><description><![CDATA[Let's implement AI Agent from scratch without using any framework. Today we implement the tool use capability.]]></description><link>https://www.newsletter.swirlai.com/p/building-ai-agents-from-scratch-part</link><guid isPermaLink="false">https://www.newsletter.swirlai.com/p/building-ai-agents-from-scratch-part</guid><dc:creator><![CDATA[Aurimas Griciūnas]]></dc:creator><pubDate>Sat, 21 Dec 2024 10:30:19 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/1144abe7-1fb8-4190-b32d-6e59647c858b_2974x2388.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p></p><div><hr></div><p>&#128075; <em>I am <a href="https://www.linkedin.com/in/aurimas-griciunas">Aurimas</a>. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.</em></p><div><hr></div><p>First of all, I want to wish you a joyful and peaceful holiday season in advance!</p><p>This is the first article in the series where we will build AI Agents from scratch without using any LLM orchestration frameworks. In this one you will learn:</p><ul><li><p>What agents are.</p></li><li><p>How Tool usage actually works.</p></li><li><p>How to build a decorator wrapper that extracts relevant details from a Python function to be passed to the LLM via the system prompt.</p></li><li><p>How to think about constructing effective system prompts that can be used for Agents.</p></li><li><p>How to build an Agent class that is able to plan and execute actions using provided Tools.</p></li></ul><p>You can find the code examples for this and following projects in the GitHub repository here:</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://github.com/swirl-ai/ai-angineers-handbook&quot;,&quot;text&quot;:&quot;AI Engineer's Handbook&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://github.com/swirl-ai/ai-angineers-handbook"><span>AI Engineer's Handbook</span></a></p><p>If something does not work as expected, feel free to DM me or leave a comment. Let&#8217;s figure it out together!</p><div><hr></div><blockquote><p>&#8220;The future of AI is Agentic.&#8221;</p></blockquote><blockquote><p>&#8220;Year 2025 will be the year of Agents.&#8221;</p></blockquote><p>These are the phrases you hear left and right nowadays. And there is a lot of truth to them. 
In order to bring the most business value out of LLMs, we are turning to complex agentic flows.</p><h3>What is an AI Agent?</h3><p>In its simplest high-level definition, an AI agent is an application that uses an LLM at its core as the reasoning engine to decide on the steps it needs to take to solve for the user&#8217;s intent. It is usually depicted similarly to the picture below and is composed of multiple building blocks:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!fVcp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!fVcp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png 424w, https://substackcdn.com/image/fetch/$s_!fVcp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png 848w, https://substackcdn.com/image/fetch/$s_!fVcp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png 1272w, https://substackcdn.com/image/fetch/$s_!fVcp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!fVcp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png" width="1456" height="1094" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c44a6b00-0d44-4efd-8fab-28e035b662d2_2926x2198.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1094,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:294780,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!fVcp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png 424w, https://substackcdn.com/image/fetch/$s_!fVcp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png 848w, https://substackcdn.com/image/fetch/$s_!fVcp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png 1272w, https://substackcdn.com/image/fetch/$s_!fVcp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F3eb64772-fbb5-4f2d-8120-d473c74fe124_2926x2198.png 1456w" sizes="100vw" loading="lazy"></picture><div 
class="image-link-expand"></div></div></a><figcaption class="image-caption">AI Agent</figcaption></figure></div><ul><li><p>Planning - the capability to plan a sequence of actions that the application needs to perform in order to solve for the provided intent.</p></li><li><p>Memory - short-term and long-term memory containing any information that the agent might need to reason about the actions it needs to take. This information is usually passed to the LLM via a system prompt as part of the core. 
You can read more about different types of memories in one of my previous articles: </p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;d5488b54-1bdb-4b59-9936-c08e79c96d77&quot;,&quot;caption&quot;:&quot;&#128075; I am Aurimas. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Memory in Agent Systems&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;I have over a decade of work experience in various data related fields: Data Analytics, Data Science, Machine Learning, Data Engineering, Cloud Engineering. For three years I have led teams working with Data and Infrastructure.&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2024-10-30T10:03:28.773Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c7650705-54b4-49a3-91a4-aad0c4093c4b_2926x2198.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/memory-in-agent-systems&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:150888366,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:42,&quot;comment_count&quot;:0,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;SwirlAI 
Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27ed734e-48b5-446d-a93d-5a54178a0e34_1024x1024.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div></li><li><p>Tools - any functions that the application can call to enhance its reasoning capabilities. Do not be fooled by the simplicity of this definition, as a tool can be literally anything:</p><ul><li><p>Simple functions defined in code.</p></li><li><p>VectorDBs and other data stores containing context.</p></li><li><p>Regular Machine Learning model APIs.</p></li><li><p>Other Agents!</p></li><li><p>&#8230;</p><p></p></li></ul></li></ul><p>In the following set of articles, I will implement most of the moving parts of an agent from scratch without using any orchestration frameworks. This episode is about Tool use.</p><p>If you are using an orchestration framework for agentic applications, you might be abstracted away from what using a tool really means. This article will help you understand what providing a tool to an agent and using it involves. I believe that understanding applications from the base building blocks is really important for a few reasons:</p><ul><li><p>Frameworks hide the implementation details of the system prompts they use, and different approaches might be needed in different use cases.</p></li><li><p>You might want to tune the low-level details to get the best performance out of the agent.</p></li><li><p>Having clarity about how the systems actually work helps build up your systems thinking, enabling you to craft advanced applications more efficiently.</p></li></ul><p></p><h3>Tool use at a high level.</h3><p>The basic thing one needs to understand when building agentic applications is that LLMs do not run code; they are only used to produce intent via prompting, and the application around them executes that intent. 
Why can ChatGPT browse the internet and return more accurate and recent results? Because ChatGPT is itself an agent, and there are many non-LLM building blocks hidden from us behind the API.</p><p>Prompt engineering becomes critical when building agentic applications, and in particular how you craft the system prompt. A simplified prompt structure looks like the following.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rZHR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F663cac67-4b46-428f-8876-d648f621f0e5_1878x766.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rZHR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F663cac67-4b46-428f-8876-d648f621f0e5_1878x766.png 424w, https://substackcdn.com/image/fetch/$s_!rZHR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F663cac67-4b46-428f-8876-d648f621f0e5_1878x766.png 848w, https://substackcdn.com/image/fetch/$s_!rZHR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F663cac67-4b46-428f-8876-d648f621f0e5_1878x766.png 1272w, https://substackcdn.com/image/fetch/$s_!rZHR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F663cac67-4b46-428f-8876-d648f621f0e5_1878x766.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rZHR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F663cac67-4b46-428f-8876-d648f621f0e5_1878x766.png" width="1456" height="594" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/663cac67-4b46-428f-8876-d648f621f0e5_1878x766.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ba764c91-f7e0-4597-84b9-fbdbae3509a9_1878x766.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:594,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:98372,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rZHR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F663cac67-4b46-428f-8876-d648f621f0e5_1878x766.png 424w, https://substackcdn.com/image/fetch/$s_!rZHR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F663cac67-4b46-428f-8876-d648f621f0e5_1878x766.png 848w, https://substackcdn.com/image/fetch/$s_!rZHR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F663cac67-4b46-428f-8876-d648f621f0e5_1878x766.png 1272w, https://substackcdn.com/image/fetch/$s_!rZHR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F663cac67-4b46-428f-8876-d648f621f0e5_1878x766.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">Prompt Structure</figcaption></figure></div><p>The agent will only perform well if the system prompt clearly describes the available tools and the expected outputs, which take the form of either planned actions or direct answers.</p><p></p><h3>Implementing the Agent.</h3><p>In this part, we will create an AI Agent that is capable of checking currency conversion rates online and performing the conversion if needed to answer the user query.</p><p>You can also find the code in a GitHub repository <a href="https://github.com/swirl-ai/ai-angineers-handbook/tree/main/building_agents_from_scratch/tool_use">here</a>.</p><p>You can follow the tutorial using the Jupyter notebook <a href="https://github.com/swirl-ai/ai-angineers-handbook/blob/main/building_agents_from_scratch/tool_use/notebooks/tool_use.ipynb">here</a>.</p><p>I will also create a YouTube video explaining 
the process in the following weeks. If you don&#8217;t want to miss it, you can subscribe to the YouTube channel <a href="https://www.youtube.com/@swirlai">here</a>.</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p></p><h4>Preparing Python functions to be used as tools.</h4><p>The easiest and most convenient way to provide tools to an agent is through functions; in our project we will use Python for this.</p><p>We do not need to include the function code itself in the system prompt, but we do need to extract useful information about it so that the LLM can decide if and how the function should be invoked.</p><p>We&#8217;ll define a dataclass that holds the desired information, including the callable itself.</p><pre><code>@dataclass
class Tool:
    name: str
    description: str
    func: Callable[..., str]
    parameters: Dict[str, Dict[str, str]]
    
    def __call__(self, *args, **kwargs) -&gt; str:
        return self.func(*args, **kwargs)</code></pre><p>The information we are extracting includes:</p><ul><li><p>Function name.</p></li><li><p>Function description (we will extract this from the docstring).</p></li><li><p>Function callable so that we can invoke it as part of the agent.</p></li><li><p>Parameters that should be used with the function so that the LLM can decide on how to call the function.</p></li></ul><p>Now we need to extract the above information from the functions we define. One requirement we will enforce is that the functions have properly formatted docstrings. We will require the following format:</p><pre><code>"""Description of what the tool does.

Parameters:
    - param1: Description of first parameter
    - param2: Description of second parameter
"""</code></pre><p>The following function extracts information about parameters - parameter names and descriptions.</p><pre><code>def parse_docstring_params(docstring: str) -&gt; Dict[str, str]:
    """Extract parameter descriptions from docstring."""
    if not docstring:
        return {}
    
    params = {}
    lines = docstring.split('\n')
    in_params = False
    current_param = None
    
    for line in lines:
        line = line.strip()
        if line.startswith('Parameters:'):
            in_params = True
        elif in_params:
            if line.startswith('-') or line.startswith('*'):
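                # partition keeps any extra colons inside the description and
                # does not fail when a line has no colon at all
                name, _, desc = line.lstrip('- *').partition(':')
                current_param = name.strip()
                params[current_param] = desc.strip()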
            elif current_param and line:
                params[current_param] += ' ' + line.strip()
            elif not line:
                in_params = False
    
    return params</code></pre><p>We will extract function parameter types from the type hints in the function definition. The function below helps format them.</p><pre><code>def get_type_description(type_hint: Any) -&gt; str:
    """Get a human-readable description of a type hint."""
    # get_origin/get_args are public typing helpers (Python 3.8+), used here
    # instead of the private _GenericAlias check.
    from typing import Literal, get_args, get_origin
    if get_origin(type_hint) is Literal:
        return f"one of {get_args(type_hint)}"
    # Fall back to str() for hints that lack __name__.
    return getattr(type_hint, "__name__", str(type_hint))</code></pre><p>A convenient way to turn a function into a tool is to use a decorator. The code below defines a tool decorator that wraps a function into a Tool. The tool name defaults to the function name unless a name is passed to the decorator.</p><pre><code>def tool(name: str = None):
    def decorator(func: Callable[..., str]) -&gt; Tool:
        tool_name = name or func.__name__
        description = inspect.getdoc(func) or "No description available"
        
        type_hints = get_type_hints(func)
        param_docs = parse_docstring_params(description)
        sig = inspect.signature(func)
        
        params = {}
        for param_name, param in sig.parameters.items():
            params[param_name] = {
                "type": get_type_description(type_hints.get(param_name, Any)),
                "description": param_docs.get(param_name, "No description available")
            }
        
        return Tool(
            name=tool_name,
            description=description.split('\n\n')[0],
            func=func,
            parameters=params
        )
    return decorator</code></pre><h4><br>Currency exchange tool.</h4><p>The code below creates a tool from a function that takes the amount to convert, the source currency code and the target currency code. The function looks up the relevant exchange rate and calculates the resulting amount.</p><pre><code>@tool()
def convert_currency(amount: float, from_currency: str, to_currency: str) -&gt; str:
    """Converts currency using latest exchange rates.
    
    Parameters:
        - amount: Amount to convert
        - from_currency: Source currency code (e.g., USD)
        - to_currency: Target currency code (e.g., EUR)
    """
    try:
        url = f"https://open.er-api.com/v6/latest/{from_currency.upper()}"
        with urllib.request.urlopen(url, timeout=10) as response:  # avoid hanging on a slow API
            data = json.loads(response.read())
            
        if "rates" not in data:
            return "Error: Could not fetch exchange rates"
            
        rate = data["rates"].get(to_currency.upper())
        if not rate:
            return f"Error: No rate found for {to_currency}"
            
        converted = amount * rate
        return f"{amount} {from_currency.upper()} = {converted:.2f} {to_currency.upper()}"
        
    except Exception as e:
        return f"Error converting currency: {str(e)}"</code></pre><p>Let&#8217;s inspect the decorated object by evaluating</p><pre><code>convert_currency</code></pre><p>It should return something like</p><pre><code>Tool(name='convert_currency', description='Converts currency using latest exchange rates.', func=&lt;function convert_currency at 0x106d8fa60&gt;, parameters={'amount': {'type': 'float', 'description': 'Amount to convert'}, 'from_currency': {'type': 'str', 'description': 'Source currency code (e.g., USD)'}, 'to_currency': {'type': 'str', 'description': 'Target currency code (e.g., EUR)'}})</code></pre><p>This is great! We have successfully extracted the information we will provide to the LLM as a tool definition.</p><p></p><h4>Crafting the system prompt.</h4><p>We will be using gpt-4o-mini as our reasoning engine. A structured prompt, such as JSON, often helps GPT-family models follow instructions more reliably, so we will do exactly that. The system prompt is the most important part of our agent. Here is the final one we will be using, written as a Python dict that is later serialized with json.dumps:</p><pre><code>{
    "role": "AI Assistant",
    "capabilities": [
        "Using provided tools to help users when necessary",
        "Responding directly without tools for questions that don't require tool usage",
        "Planning efficient tool usage sequences"
    ],
    "instructions": [
        "Use tools only when they are necessary for the task",
        "If a query can be answered directly, respond with a simple message instead of using tools",
        "When tools are needed, plan their usage efficiently to minimize tool calls"
    ],
    "tools": [
        {
            "name": tool.name,
            "description": tool.description,
            "parameters": {
                name: {
                    "type": info["type"],
                    "description": info["description"]
                }
                for name, info in tool.parameters.items()
            }
        }
        for tool in self.tools.values()
    ],
    "response_format": {
        "type": "json",
        "schema": {
            "requires_tools": {
                "type": "boolean",
                "description": "whether tools are needed for this query"
            },
            "direct_response": {
                "type": "string",
                "description": "response when no tools are needed",
                "optional": True
            },
            "thought": {
                "type": "string", 
                "description": "reasoning about how to solve the task (when tools are needed)",
                "optional": True
            },
            "plan": {
                "type": "array",
                "items": {"type": "string"},
                "description": "steps to solve the task (when tools are needed)",
                "optional": True
            },
            "tool_calls": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "tool": {
                            "type": "string",
                            "description": "name of the tool"
                        },
                        "args": {
                            "type": "object",
                            "description": "parameters for the tool"
                        }
                    }
                },
                "description": "tools to call in sequence (when tools are needed)",
                "optional": True
            }
        },
        "examples": [
            {
                "query": "Convert 100 USD to EUR",
                "response": {
                    "requires_tools": True,
                    "thought": "I need to use the currency conversion tool to convert USD to EUR",
                    "plan": [
                        "Use convert_currency tool to convert 100 USD to EUR",
                        "Return the conversion result"
                    ],
                    "tool_calls": [
                        {
                            "tool": "convert_currency",
                            "args": {
                                "amount": 100,
                                "from_currency": "USD", 
                                "to_currency": "EUR"
                            }
                        }
                    ]
                }
            },
            {
                "query": "What's 500 Japanese Yen in British Pounds?",
                "response": {
                    "requires_tools": True,
                    "thought": "I need to convert JPY to GBP using the currency converter",
                    "plan": [
                        "Use convert_currency tool to convert 500 JPY to GBP",
                        "Return the conversion result"
                    ],
                    "tool_calls": [
                        {
                            "tool": "convert_currency",
                            "args": {
                                "amount": 500,
                                "from_currency": "JPY",
                                "to_currency": "GBP"
                            }
                        }
                    ]
                }
            },
            {
                "query": "What currency does Japan use?",
                "response": {
                    "requires_tools": False,
                    "direct_response": "Japan uses the Japanese Yen (JPY) as its official currency. This is common knowledge that doesn't require using the currency conversion tool."
                }
            }
        ]
    }
}</code></pre><p>A lot to unpack, let&#8217;s analyse it step by step:</p><pre><code>"role": "AI Assistant",
"capabilities": [
    "Using provided tools to help users when necessary",
    "Responding directly without tools for questions that don't require tool usage",
    "Planning efficient tool usage sequences"
],
"instructions": [
    "Use tools only when they are necessary for the task",
    "If a query can be answered directly, respond with a simple message instead of using tools",
    "When tools are needed, plan their usage efficiently to minimize tool calls"
]</code></pre><p>This is where we define the qualities of the Agent. In general, we are enforcing the behaviour that tools should be used only when necessary.</p><pre><code>"tools": [
    {
        "name": tool.name,
        "description": tool.description,
        "parameters": {
            name: {
                "type": info["type"],
                "description": info["description"]
            }
            for name, info in tool.parameters.items()
        }
    }
    for tool in self.tools.values()
]</code></pre><p>This is where we unpack the tools into a list. The tool registry will be part of the Agent class, which is why we loop through self.tools.values(). Remember, each tool is an instance of the Tool dataclass we created in the first part.</p><pre><code>"response_format": {
    "type": "json",
    "schema": {
        "requires_tools": {
            "type": "boolean",
            "description": "whether tools are needed for this query"
        },
        "direct_response": {
            "type": "string",
            "description": "response when no tools are needed",
            "optional": True
        },
        "thought": {
            "type": "string", 
            "description": "reasoning about how to solve the task (when tools are needed)",
            "optional": True
        },
        "plan": {
            "type": "array",
            "items": {"type": "string"},
            "description": "steps to solve the task (when tools are needed)",
            "optional": True
        },
        "tool_calls": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "tool": {
                        "type": "string",
                        "description": "name of the tool"
                    },
                    "args": {
                        "type": "object",
                        "description": "parameters for the tool"
                    }
                }
            },
            "description": "tools to call in sequence (when tools are needed)",
            "optional": True
        }
    }
}</code></pre><p>The above describes the output schema we expect from the LLM. We provide strict instructions here:</p><ul><li><p>requires_tools: whether tool usage is required.</p></li><li><p>direct_response: the direct answer, returned when requires_tools is false.</p></li><li><p>thought: a description of how the task should be solved.</p></li><li><p>plan: the steps to solve the task.</p></li><li><p>tool_calls: the tool calls in sequence, including the tool names and the arguments to be used. Our examples only include a single tool call, but the list can hold more.</p></li></ul><pre><code>"examples": [
    {
        "query": "Convert 100 USD to EUR",
        "response": {
            "requires_tools": True,
            "thought": "I need to use the currency conversion tool to convert USD to EUR",
            "plan": [
                "Use convert_currency tool to convert 100 USD to EUR",
                "Return the conversion result"
            ],
            "tool_calls": [
                {
                    "tool": "convert_currency",
                    "args": {
                        "amount": 100,
                        "from_currency": "USD", 
                        "to_currency": "EUR"
                    }
                }
            ]
        }
    },
    {
        "query": "What's 500 Japanese Yen in British Pounds?",
        "response": {
            "requires_tools": True,
            "thought": "I need to convert JPY to GBP using the currency converter",
            "plan": [
                "Use convert_currency tool to convert 500 JPY to GBP",
                "Return the conversion result"
            ],
            "tool_calls": [
                {
                    "tool": "convert_currency",
                    "args": {
                        "amount": 500,
                        "from_currency": "JPY",
                        "to_currency": "GBP"
                    }
                }
            ]
        }
    },
    {
        "query": "What currency does Japan use?",
        "response": {
            "requires_tools": False,
            "direct_response": "Japan uses the Japanese Yen (JPY) as its official currency. This is common knowledge that doesn't require using the currency conversion tool."
        }
    }
]</code></pre><p>Finally, the examples above show the LLM what correct reasoning and output look like.</p><p></p><h4>Implementing the Agent Class</h4><p>The agent class is quite lengthy due to the long system prompt:</p><pre><code>class Agent:
    def __init__(self):
        """Initialize Agent with empty tool registry."""
        self.client = openai.OpenAI(api_key=os.getenv("OPENAI_API_KEY"))
        self.tools: Dict[str, Tool] = {}
    
    def add_tool(self, tool: Tool) -&gt; None:
        """Register a new tool with the agent."""
        self.tools[tool.name] = tool
    
    def get_available_tools(self) -&gt; List[str]:
        """Get list of available tool descriptions."""
        return [f"{tool.name}: {tool.description}" for tool in self.tools.values()]
    
    def use_tool(self, tool_name: str, **kwargs: Any) -&gt; str:
        """Execute a specific tool with given arguments."""
        if tool_name not in self.tools:
            raise ValueError(f"Tool '{tool_name}' not found. Available tools: {list(self.tools.keys())}")
        
        tool = self.tools[tool_name]
        return tool.func(**kwargs)

    def create_system_prompt(self) -&gt; str:
        """Create the system prompt for the LLM with available tools."""
        tools_json = {
            "role": "AI Assistant",
            "capabilities": [
                "Using provided tools to help users when necessary",
                "Responding directly without tools for questions that don't require tool usage",
                "Planning efficient tool usage sequences"
            ],
            "instructions": [
                "Use tools only when they are necessary for the task",
                "If a query can be answered directly, respond with a simple message instead of using tools",
                "When tools are needed, plan their usage efficiently to minimize tool calls"
            ],
            "tools": [
                {
                    "name": tool.name,
                    "description": tool.description,
                    "parameters": {
                        name: {
                            "type": info["type"],
                            "description": info["description"]
                        }
                        for name, info in tool.parameters.items()
                    }
                }
                for tool in self.tools.values()
            ],
            "response_format": {
                "type": "json",
                "schema": {
                    "requires_tools": {
                        "type": "boolean",
                        "description": "whether tools are needed for this query"
                    },
                    "direct_response": {
                        "type": "string",
                        "description": "response when no tools are needed",
                        "optional": True
                    },
                    "thought": {
                        "type": "string", 
                        "description": "reasoning about how to solve the task (when tools are needed)",
                        "optional": True
                    },
                    "plan": {
                        "type": "array",
                        "items": {"type": "string"},
                        "description": "steps to solve the task (when tools are needed)",
                        "optional": True
                    },
                    "tool_calls": {
                        "type": "array",
                        "items": {
                            "type": "object",
                            "properties": {
                                "tool": {
                                    "type": "string",
                                    "description": "name of the tool"
                                },
                                "args": {
                                    "type": "object",
                                    "description": "parameters for the tool"
                                }
                            }
                        },
                        "description": "tools to call in sequence (when tools are needed)",
                        "optional": True
                    }
                },
                "examples": [
                    {
                        "query": "Convert 100 USD to EUR",
                        "response": {
                            "requires_tools": True,
                            "thought": "I need to use the currency conversion tool to convert USD to EUR",
                            "plan": [
                                "Use convert_currency tool to convert 100 USD to EUR",
                                "Return the conversion result"
                            ],
                            "tool_calls": [
                                {
                                    "tool": "convert_currency",
                                    "args": {
                                        "amount": 100,
                                        "from_currency": "USD", 
                                        "to_currency": "EUR"
                                    }
                                }
                            ]
                        }
                    },
                    {
                        "query": "What's 500 Japanese Yen in British Pounds?",
                        "response": {
                            "requires_tools": True,
                            "thought": "I need to convert JPY to GBP using the currency converter",
                            "plan": [
                                "Use convert_currency tool to convert 500 JPY to GBP",
                                "Return the conversion result"
                            ],
                            "tool_calls": [
                                {
                                    "tool": "convert_currency",
                                    "args": {
                                        "amount": 500,
                                        "from_currency": "JPY",
                                        "to_currency": "GBP"
                                    }
                                }
                            ]
                        }
                    },
                    {
                        "query": "What currency does Japan use?",
                        "response": {
                            "requires_tools": False,
                            "direct_response": "Japan uses the Japanese Yen (JPY) as its official currency. This is common knowledge that doesn't require using the currency conversion tool."
                        }
                    }
                ]
            }
        }
        
        return f"""You are an AI assistant that helps users by providing direct answers or using tools when necessary.
Configuration, instructions, and available tools are provided in JSON format below:

{json.dumps(tools_json, indent=2)}

Always respond with a JSON object following the response_format schema above. 
Remember to use tools only when they are actually needed for the task."""

    def plan(self, user_query: str) -&gt; Dict:
        """Use LLM to create a plan for tool usage."""
        messages = [
            {"role": "system", "content": self.create_system_prompt()},
            {"role": "user", "content": user_query}
        ]
        
        response = self.client.chat.completions.create(
            model="gpt-4o-mini",
            messages=messages,
            temperature=0
        )
        
        try:
            return json.loads(response.choices[0].message.content)
        except json.JSONDecodeError:
            raise ValueError("Failed to parse LLM response as JSON")

    def execute(self, user_query: str) -&gt; str:
        """Execute the full pipeline: plan and execute tools."""
        try:
            plan = self.plan(user_query)
            
            if not plan.get("requires_tools", True):
                return plan["direct_response"]
            
            # Execute each tool in sequence
            results = []
            for tool_call in plan["tool_calls"]:
                tool_name = tool_call["tool"]
                tool_args = tool_call["args"]
                result = self.use_tool(tool_name, **tool_args)
                results.append(result)
            
            # Combine results
            return f"""Thought: {plan['thought']}
Plan: {'. '.join(plan['plan'])}
Results: {'. '.join(results)}"""
            
        except Exception as e:
            return f"Error executing plan: {str(e)}"</code></pre><p>Let&#8217;s look into it step by step (skipping the create_system_prompt method as we already analysed it in the previous part).</p><pre><code>def add_tool(self, tool: Tool) -&gt; None:
    """Register a new tool with the agent."""
    self.tools[tool.name] = tool

def get_available_tools(self) -&gt; List[str]:
    """Get list of available tool descriptions."""
    return [f"{tool.name}: {tool.description}" for tool in self.tools.values()]

def use_tool(self, tool_name: str, **kwargs: Any) -&gt; str:
    """Execute a specific tool with given arguments."""
    if tool_name not in self.tools:
        raise ValueError(f"Tool '{tool_name}' not found. Available tools: {list(self.tools.keys())}")
    
    tool = self.tools[tool_name]
    return tool.func(**kwargs)</code></pre><p>The methods above manage the agent&#8217;s tools:</p><ul><li><p>Attaching a tool to the agent.</p></li><li><p>Listing the attached tools.</p></li><li><p>Invoking a tool by name with the given arguments.</p></li></ul><pre><code>def plan(self, user_query: str) -&gt; Dict:
    """Use LLM to create a plan for tool usage."""
    messages = [
        {"role": "system", "content": self.create_system_prompt()},
        {"role": "user", "content": user_query}
    ]
    
    response = self.client.chat.completions.create(
        model="gpt-4o-mini",
        messages=messages,
        temperature=0
    )
    
    try:
        return json.loads(response.choices[0].message.content)
    except json.JSONDecodeError:
        raise ValueError("Failed to parse LLM response as JSON")</code></pre><p>The method above sends the system prompt together with the user query to the LLM and parses the reply as JSON. Because we defined the expected output format in the system prompt, the parsed result contains either the tool calls that the LLM planned or a direct answer when no tool call is needed. If the reply is not valid JSON, a ValueError is raised.</p><pre><code>def execute(self, user_query: str) -&gt; str:
    """Execute the full pipeline: plan and execute tools."""
    try:
        plan = self.plan(user_query)
        
        if not plan.get("requires_tools", True):
            return plan["direct_response"]
        
        # Execute each tool in sequence
        results = []
        for tool_call in plan["tool_calls"]:
            tool_name = tool_call["tool"]
            tool_args = tool_call["args"]
            result = self.use_tool(tool_name, **tool_args)
            results.append(result)
        
        # Combine results
        return f"""Thought: {plan['thought']}
Plan: {'. '.join(plan['plan'])}
Results: {'. '.join(results)}"""
        
    except Exception as e:
        return f"Error executing plan: {str(e)}"</code></pre><p>The method above calls the plan method and acts on the result. If no tools are required, it returns the direct response. Otherwise it loops through the planned tool calls, since a plan can include several sequential tool executions, and returns the thought, the plan and the results as a single string. Any exception is caught and returned as an error message.</p><h4>Running the Agent.</h4><p>That&#8217;s it, we have all of the necessary code to create and use the Agent. In the following code we initialise the agent, attach the convert_currency tool to it and loop through two user queries. The first one should require tool use, while the second should not.</p><pre><code>agent = Agent()
agent.add_tool(convert_currency)

query_list = [
    "I am traveling to Japan from Serbia, I have 1500 of local currency, how much of Japanese currency will I be able to get?",
    "How are you doing?",
]

for query in query_list:
    print(f"\nQuery: {query}")
    result = agent.execute(query)
    print(result)</code></pre><p>The output should be similar to:</p><pre><code>Query: I am traveling to Japan from Serbia, I have 1500 of local currency, how much of Japanese currency will I be able to get?
Thought: I need to convert 1500 Serbian Dinars (RSD) to Japanese Yen (JPY) using the currency conversion tool.
Plan: Use convert_currency tool to convert 1500 RSD to JPY. Return the conversion result
Results: 1500 RSD = 2087.49 JPY

Query: How are you doing?
I'm just a computer program, so I don't have feelings, but I'm here and ready to help you!</code></pre><p>As expected! The first query uses the tool, while the second does not.</p><p></p><h4>That&#8217;s it for today, we&#8217;ve learned:</h4><ul><li><p>How to wrap Python functions so that they can be provided as tools to the Agent.</p></li><li><p>How to craft a system prompt that uses the tool definitions in planning the execution.</p></li><li><p>How to implement the agent that executes the plan.</p><div><hr></div></li></ul>]]></content:encoded></item><item><title><![CDATA[AI Clouds and their role in the AI era]]></title><description><![CDATA[And a hands-on project: Mistral-7B powered chatbot on Nebius AI Cloud]]></description><link>https://www.newsletter.swirlai.com/p/ai-clouds-and-their-role-in-the-ai</link><guid isPermaLink="false">https://www.newsletter.swirlai.com/p/ai-clouds-and-their-role-in-the-ai</guid><dc:creator><![CDATA[Aurimas Griciūnas]]></dc:creator><pubDate>Fri, 06 Dec 2024 08:04:14 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/2f758def-d71b-4ae7-9548-cd6cc58bde0a_2970x2243.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#128075; <em>I am <a href="https://www.linkedin.com/in/aurimas-griciunas">Aurimas</a>. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">SwirlAI Newsletter is a reader-supported publication. 
To receive new posts and support my work, consider becoming a subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><p>With the rise of AI, a new breed of Cloud that specifically offers GPU optimised resources has emerged. It is no surprise as the demand for GPUs has soared. We call them - AI Clouds. In this article you will find:</p><ul><li><p>An overview of the role of AI Cloud in the age of AI.</p></li><li><p>A hands-on project where you will:</p><ul><li><p>Set up a Kubernetes cluster on AI Cloud</p></li><li><p>Deploy an open source LLM (Mistral-7B-Instruct) from HuggingFace using vLLM server on Kubernetes.</p></li><li><p>Build a simple Streamlit based chatbot that uses the previously deployed model endpoint.</p></li><li><p>Expose all of this externally so that you can access the chatbot via your browser.</p></li><li><p>Here is a high level diagram:</p></li></ul></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!uh-J!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc87a787a-2f64-4e1a-821b-0f5ddb83d91d_1926x1321.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!uh-J!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc87a787a-2f64-4e1a-821b-0f5ddb83d91d_1926x1321.png 424w, 
https://substackcdn.com/image/fetch/$s_!uh-J!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc87a787a-2f64-4e1a-821b-0f5ddb83d91d_1926x1321.png 848w, https://substackcdn.com/image/fetch/$s_!uh-J!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc87a787a-2f64-4e1a-821b-0f5ddb83d91d_1926x1321.png 1272w, https://substackcdn.com/image/fetch/$s_!uh-J!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc87a787a-2f64-4e1a-821b-0f5ddb83d91d_1926x1321.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!uh-J!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc87a787a-2f64-4e1a-821b-0f5ddb83d91d_1926x1321.png" width="618" height="424.0260989010989" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c87a787a-2f64-4e1a-821b-0f5ddb83d91d_1926x1321.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dee4206c-e1f0-41be-931c-d8c9a41a4144_1926x1321.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:999,&quot;width&quot;:1456,&quot;resizeWidth&quot;:618,&quot;bytes&quot;:196499,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!uh-J!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc87a787a-2f64-4e1a-821b-0f5ddb83d91d_1926x1321.png 424w, 
https://substackcdn.com/image/fetch/$s_!uh-J!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc87a787a-2f64-4e1a-821b-0f5ddb83d91d_1926x1321.png 848w, https://substackcdn.com/image/fetch/$s_!uh-J!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc87a787a-2f64-4e1a-821b-0f5ddb83d91d_1926x1321.png 1272w, https://substackcdn.com/image/fetch/$s_!uh-J!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc87a787a-2f64-4e1a-821b-0f5ddb83d91d_1926x1321.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" 
y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>If you do run into any problems while following the project, let me know in the comment section or drop me a PM, we will solve it together.</p><div><hr></div><p>This newsletter episode is sponsored by Nebius - an emerging AI Cloud provider. They are currently running a campaign, offering substantial discounts on the first 1000 GPU hours for NVIDIA&#174; H100 Tensor Core GPU instances with a price point of 1.5 USD/GPU hour.  </p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Psk6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Psk6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 424w, https://substackcdn.com/image/fetch/$s_!Psk6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 848w, https://substackcdn.com/image/fetch/$s_!Psk6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 1272w, https://substackcdn.com/image/fetch/$s_!Psk6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Psk6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png" width="546" height="103.74" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:190,&quot;width&quot;:1000,&quot;resizeWidth&quot;:546,&quot;bytes&quot;:null,&quot;alt&quot;:&quot;Online classes - nebius&quot;,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="Online classes - nebius" title="Online classes - nebius" srcset="https://substackcdn.com/image/fetch/$s_!Psk6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 424w, https://substackcdn.com/image/fetch/$s_!Psk6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 848w, https://substackcdn.com/image/fetch/$s_!Psk6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 1272w, https://substackcdn.com/image/fetch/$s_!Psk6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff6b5cb27-f29b-4fbe-a15e-14119c84a7b9_1000x190.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>You should consider registering if 
you want to take advantage of the offer. Also, you will need an account if you want to follow the hands-on project later in the article. </p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://nebius.com/explorer-tier?utm_medium=newsletter&amp;utm_source=ag&amp;utm_campaign=explorer-tier&quot;,&quot;text&quot;:&quot;Create an account&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://nebius.com/explorer-tier?utm_medium=newsletter&amp;utm_source=ag&amp;utm_campaign=explorer-tier"><span>Create an account</span></a></p><div><hr></div><h3>What are AI Clouds and when might you consider using one?</h3><p>As mentioned at the beginning of the article, AI Clouds specifically offer GPU-optimised resources, usually with advanced performance configurations, so that a user can spin up integrated GPU nodes with ease. There is an ongoing debate about whether you should simply use third-party LLM API providers like OpenAI/Anthropic or an AI Cloud like Nebius to serve your own open source models for inference purposes. Let&#8217;s look into the pros and cons of both.</p><p>The LLM application lifecycle has grown complex over the past few years. 
We went from pre-training of foundation models and then finetuning them for alignment, to building simple RAG (Retrieval Augmented Generation) systems on top of them and eventually evolving them to complex agentic systems capable of more advanced automation.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!0zU8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb60eb7cd-44c4-46ad-a8ad-d821025b982c_2926x1122.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!0zU8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb60eb7cd-44c4-46ad-a8ad-d821025b982c_2926x1122.png 424w, https://substackcdn.com/image/fetch/$s_!0zU8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb60eb7cd-44c4-46ad-a8ad-d821025b982c_2926x1122.png 848w, https://substackcdn.com/image/fetch/$s_!0zU8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb60eb7cd-44c4-46ad-a8ad-d821025b982c_2926x1122.png 1272w, https://substackcdn.com/image/fetch/$s_!0zU8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb60eb7cd-44c4-46ad-a8ad-d821025b982c_2926x1122.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!0zU8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb60eb7cd-44c4-46ad-a8ad-d821025b982c_2926x1122.png" width="724" height="277.467032967033" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b60eb7cd-44c4-46ad-a8ad-d821025b982c_2926x1122.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e3ca10ff-5fa9-458c-ad3e-11c53307e858_2926x1122.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:558,&quot;width&quot;:1456,&quot;resizeWidth&quot;:724,&quot;bytes&quot;:191635,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!0zU8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb60eb7cd-44c4-46ad-a8ad-d821025b982c_2926x1122.png 424w, https://substackcdn.com/image/fetch/$s_!0zU8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb60eb7cd-44c4-46ad-a8ad-d821025b982c_2926x1122.png 848w, https://substackcdn.com/image/fetch/$s_!0zU8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb60eb7cd-44c4-46ad-a8ad-d821025b982c_2926x1122.png 1272w, https://substackcdn.com/image/fetch/$s_!0zU8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb60eb7cd-44c4-46ad-a8ad-d821025b982c_2926x1122.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" 
stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">LLM Application Lifecycle</figcaption></figure></div><p>If you are training your own foundation LLM or aligning it to human preference, there is no escape from needing a GPU cluster; no proprietary model provider will help you here. Unless you are one of the largest companies on the planet that train foundation models, building your own physical GPU clusters is too capital-intensive to be an option. AI Clouds help here by offering competitive prices for interconnected and optimised GPU resources. 
For a startup that is trying to break into frontier model development, it might be the only way to get into the game.</p><p>When it comes to finetuning, some of the largest model providers offer finetuning functionality on top of their frontier models, but that increases the cost of inference even more and adds another level of &#8220;black box&#8221; to the resulting model. For some low traffic/volume use cases this might be just enough. However, if you are serious about the product you are building with LLMs, you might consider finetuning an open source model like Llama 3.1 and serving the resulting model on an AI Cloud.</p><p>The interesting part of the LLM application lifecycle to analyse when choosing between proprietary model APIs and deploying on AI Clouds is building AI-powered systems like RAG or Agents, where inference is key. The choice might not always be obvious there. Let&#8217;s dig deeper.</p><p></p><h4>The TCO of LLM inference.</h4><p>One could argue that deploying and maintaining your own LLMs on an AI Cloud is a challenge that brings its own additional cost, and that is true. The added cost, however, is usually predictable and grows roughly linearly with the number of GPUs you need. If you are running distributed workloads, the additional cost of operation (usually human labour) on top of pure GPU costs may even keep decreasing when divided per GPU operated. Remember that we are talking about managed GPU clusters and not building your own racks.</p><p>Before evaluating the costs of using proprietary model APIs vs. hosting your own LLMs on an AI Cloud, you want to consider whether it is viable for your application to run on an open source model in the first place. 
Nowadays it is usually a valid option: open source models are catching up with frontier models quickly, and RAG or agentic applications do not require an extremely powerful general-purpose model, because they use the model as a reasoning/planning engine rather than as a system that can answer any question by default. We can usually do well with smaller but more task-focused models. The text below assumes that you have assessed the viability of using open source models.</p><p>TCO (total cost of ownership) is an important measure that should eventually drive your decision between using proprietary LLM APIs and serving your own model on an AI Cloud. The cost of proprietary LLM APIs is all about input and output tokens. The good thing about these APIs is that we will always be able to get our inference results (assuming that the API does not go down and you have a paid plan with high enough quotas), and the volume can easily be scaled down to zero. The cost curve over time fluctuates with the traffic to your application.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!tukE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43981def-54a6-4503-ac03-919a41947642_1747x1215.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!tukE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43981def-54a6-4503-ac03-919a41947642_1747x1215.png 424w, https://substackcdn.com/image/fetch/$s_!tukE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43981def-54a6-4503-ac03-919a41947642_1747x1215.png 848w, 
https://substackcdn.com/image/fetch/$s_!tukE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43981def-54a6-4503-ac03-919a41947642_1747x1215.png 1272w, https://substackcdn.com/image/fetch/$s_!tukE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43981def-54a6-4503-ac03-919a41947642_1747x1215.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!tukE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43981def-54a6-4503-ac03-919a41947642_1747x1215.png" width="620" height="431.3598901098901" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/43981def-54a6-4503-ac03-919a41947642_1747x1215.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b30d4013-aa9d-4ba6-94b5-192042d1a5c0_1747x1215.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1013,&quot;width&quot;:1456,&quot;resizeWidth&quot;:620,&quot;bytes&quot;:79616,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!tukE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43981def-54a6-4503-ac03-919a41947642_1747x1215.png 424w, https://substackcdn.com/image/fetch/$s_!tukE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43981def-54a6-4503-ac03-919a41947642_1747x1215.png 848w, 
https://substackcdn.com/image/fetch/$s_!tukE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43981def-54a6-4503-ac03-919a41947642_1747x1215.png 1272w, https://substackcdn.com/image/fetch/$s_!tukE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F43981def-54a6-4503-ac03-919a41947642_1747x1215.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">TCO - Proprietary model APIs</figcaption></figure></div><p>On the other hand, if you are deploying your own models 
on an AI Cloud, you will need a dedicated set of resources that is either always running or shut down. In the simplest scenario, the cost curve over time is flat. Of course, this is rare in a production application, as you will want to implement some sort of autoscaling to bring your cost per token down while keeping the users of your application happy with its responsiveness.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!thnE!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa28cea1f-2a9b-41f9-8116-73ffc1d8ede8_1747x1215.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!thnE!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa28cea1f-2a9b-41f9-8116-73ffc1d8ede8_1747x1215.png 424w, https://substackcdn.com/image/fetch/$s_!thnE!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa28cea1f-2a9b-41f9-8116-73ffc1d8ede8_1747x1215.png 848w, https://substackcdn.com/image/fetch/$s_!thnE!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa28cea1f-2a9b-41f9-8116-73ffc1d8ede8_1747x1215.png 1272w, https://substackcdn.com/image/fetch/$s_!thnE!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa28cea1f-2a9b-41f9-8116-73ffc1d8ede8_1747x1215.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!thnE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa28cea1f-2a9b-41f9-8116-73ffc1d8ede8_1747x1215.png" width="620" 
height="431.3598901098901" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a28cea1f-2a9b-41f9-8116-73ffc1d8ede8_1747x1215.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9651ef6e-c681-4fb2-a563-b3f3be05c433_1747x1215.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1013,&quot;width&quot;:1456,&quot;resizeWidth&quot;:620,&quot;bytes&quot;:60010,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!thnE!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa28cea1f-2a9b-41f9-8116-73ffc1d8ede8_1747x1215.png 424w, https://substackcdn.com/image/fetch/$s_!thnE!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa28cea1f-2a9b-41f9-8116-73ffc1d8ede8_1747x1215.png 848w, https://substackcdn.com/image/fetch/$s_!thnE!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa28cea1f-2a9b-41f9-8116-73ffc1d8ede8_1747x1215.png 1272w, https://substackcdn.com/image/fetch/$s_!thnE!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa28cea1f-2a9b-41f9-8116-73ffc1d8ede8_1747x1215.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" 
fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">TOC - AI Cloud</figcaption></figure></div><h4><em>Throughput, latency and autoscaling in AI Clouds.</em></h4><p>The important part is to evaluate what throughput and latency levels your application will require:</p><ul><li><p>Throughput - how many requests per time interval the system can handle (usually measured in requests per second).</p></li><li><p>Latency - how fast the application is able to return the first relevant point of data. LLM based applications, especially chatbots, are a special case: the first relevant point is usually considered to be the first returned inference token, because the following tokens can be generated and rendered faster than an average human can read. The important measure here is TTFT - time to first token. 
So in order to reduce TTFT, we aim to reduce the time spent on <em>Prefill</em>, the phase that covers all operations the LLM needs to perform before it starts generating tokens.</p></li></ul><p>Not surprisingly, throughput can be increased by throwing more wood on the fire - increasing the size of the GPU cluster that serves the models - and decreased by shrinking it. In some cases it is enough to scale the cluster horizontally, deploying the same model on separate machines to serve more traffic; in other cases there is a need to deploy the same model using more GPUs.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!huu8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb693356-d5c2-44f7-8564-1c1a44d9b991_1747x1215.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!huu8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb693356-d5c2-44f7-8564-1c1a44d9b991_1747x1215.png 424w, https://substackcdn.com/image/fetch/$s_!huu8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb693356-d5c2-44f7-8564-1c1a44d9b991_1747x1215.png 848w, https://substackcdn.com/image/fetch/$s_!huu8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb693356-d5c2-44f7-8564-1c1a44d9b991_1747x1215.png 1272w, https://substackcdn.com/image/fetch/$s_!huu8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb693356-d5c2-44f7-8564-1c1a44d9b991_1747x1215.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!huu8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb693356-d5c2-44f7-8564-1c1a44d9b991_1747x1215.png" width="620" height="431.3598901098901" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fb693356-d5c2-44f7-8564-1c1a44d9b991_1747x1215.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/83955846-400a-481f-9d82-94630a74071a_1747x1215.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1013,&quot;width&quot;:1456,&quot;resizeWidth&quot;:620,&quot;bytes&quot;:69279,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!huu8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb693356-d5c2-44f7-8564-1c1a44d9b991_1747x1215.png 424w, https://substackcdn.com/image/fetch/$s_!huu8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb693356-d5c2-44f7-8564-1c1a44d9b991_1747x1215.png 848w, https://substackcdn.com/image/fetch/$s_!huu8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb693356-d5c2-44f7-8564-1c1a44d9b991_1747x1215.png 1272w, https://substackcdn.com/image/fetch/$s_!huu8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffb693356-d5c2-44f7-8564-1c1a44d9b991_1747x1215.png 1456w" sizes="100vw" loading="lazy"></picture><div 
class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">TOC - AI Cloud with Autoscaling</figcaption></figure></div><p>In any case, it is worth pre-calculating the number of GPUs you will need and estimating the traffic that needs to be supported over time - less traffic means less throughput needed. Knowing your approximate traffic needs, you can implement Upscaling and Downscaling strategies that allow you to manage the TCO more efficiently.</p><p>An additional advantage of running on an AI Cloud is that you can control throughput and latency at a granular level. 
This also means that in most cases, with enough tuning, you would be able to achieve lower latencies than with a proprietary model API. Also, you can deploy other relevant services closer to the models being served, reducing the end-to-end latency even more, as the network traffic can be contained within a single region.</p><p>An interesting stage in the process of building LLM apps is prototyping, when there is no user traffic to handle. While you can easily scale down to zero usage with proprietary APIs, you can also do that in AI Clouds by scaling the cluster size to zero during periods of inactivity.</p><p></p><h4><em>TOC - Proprietary model APIs vs. AI Clouds</em></h4><p>Let&#8217;s see how the cost curves of Proprietary model APIs and AI Clouds could look when overlaid on top of each other.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!F4cH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9354a96-972a-4119-9555-09224da36b32_1747x1215.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!F4cH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9354a96-972a-4119-9555-09224da36b32_1747x1215.png 424w, https://substackcdn.com/image/fetch/$s_!F4cH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9354a96-972a-4119-9555-09224da36b32_1747x1215.png 848w, https://substackcdn.com/image/fetch/$s_!F4cH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9354a96-972a-4119-9555-09224da36b32_1747x1215.png 1272w, 
https://substackcdn.com/image/fetch/$s_!F4cH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9354a96-972a-4119-9555-09224da36b32_1747x1215.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!F4cH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9354a96-972a-4119-9555-09224da36b32_1747x1215.png" width="620" height="431.3598901098901" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b9354a96-972a-4119-9555-09224da36b32_1747x1215.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ca3c702b-8261-4d2e-83bc-60fcc7ab8c32_1747x1215.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1013,&quot;width&quot;:1456,&quot;resizeWidth&quot;:620,&quot;bytes&quot;:86653,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!F4cH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9354a96-972a-4119-9555-09224da36b32_1747x1215.png 424w, https://substackcdn.com/image/fetch/$s_!F4cH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9354a96-972a-4119-9555-09224da36b32_1747x1215.png 848w, https://substackcdn.com/image/fetch/$s_!F4cH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9354a96-972a-4119-9555-09224da36b32_1747x1215.png 1272w, 
https://substackcdn.com/image/fetch/$s_!F4cH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb9354a96-972a-4119-9555-09224da36b32_1747x1215.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">TOC - Proprietary model APIs vs. AI Cloud</figcaption></figure></div><p>The math is obvious. You are better off with an AI Cloud in the green areas and better off with proprietary LLM API providers in the red areas. 
If you consider using an AI Cloud your goal should be to stay in the green as much as possible (when it comes to optimising TOC).</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!JtMh!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7615820-01a2-43a5-9b27-3d66b08e3d67_1747x1215.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!JtMh!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7615820-01a2-43a5-9b27-3d66b08e3d67_1747x1215.png 424w, https://substackcdn.com/image/fetch/$s_!JtMh!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7615820-01a2-43a5-9b27-3d66b08e3d67_1747x1215.png 848w, https://substackcdn.com/image/fetch/$s_!JtMh!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7615820-01a2-43a5-9b27-3d66b08e3d67_1747x1215.png 1272w, https://substackcdn.com/image/fetch/$s_!JtMh!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7615820-01a2-43a5-9b27-3d66b08e3d67_1747x1215.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!JtMh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7615820-01a2-43a5-9b27-3d66b08e3d67_1747x1215.png" width="620" height="431.3598901098901" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d7615820-01a2-43a5-9b27-3d66b08e3d67_1747x1215.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/19dced9b-bb9b-4f8e-a595-016f75c2e730_1747x1215.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1013,&quot;width&quot;:1456,&quot;resizeWidth&quot;:620,&quot;bytes&quot;:107262,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!JtMh!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7615820-01a2-43a5-9b27-3d66b08e3d67_1747x1215.png 424w, https://substackcdn.com/image/fetch/$s_!JtMh!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7615820-01a2-43a5-9b27-3d66b08e3d67_1747x1215.png 848w, https://substackcdn.com/image/fetch/$s_!JtMh!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7615820-01a2-43a5-9b27-3d66b08e3d67_1747x1215.png 1272w, https://substackcdn.com/image/fetch/$s_!JtMh!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd7615820-01a2-43a5-9b27-3d66b08e3d67_1747x1215.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" 
stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">TOC - Proprietary model APIs vs. AI Cloud</figcaption></figure></div><p>The truth of the matter is that if you are running an LLM application at scale with high throughput requirements, it is not hard to keep your costs lower on an AI Cloud than with a proprietary API. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Uh8E!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc9d17a8-0fa9-45b1-a53d-aa6e28273c9c_1747x1215.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Uh8E!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc9d17a8-0fa9-45b1-a53d-aa6e28273c9c_1747x1215.png 424w, https://substackcdn.com/image/fetch/$s_!Uh8E!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc9d17a8-0fa9-45b1-a53d-aa6e28273c9c_1747x1215.png 848w, https://substackcdn.com/image/fetch/$s_!Uh8E!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc9d17a8-0fa9-45b1-a53d-aa6e28273c9c_1747x1215.png 1272w, https://substackcdn.com/image/fetch/$s_!Uh8E!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc9d17a8-0fa9-45b1-a53d-aa6e28273c9c_1747x1215.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Uh8E!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc9d17a8-0fa9-45b1-a53d-aa6e28273c9c_1747x1215.png" width="620" height="431.3598901098901" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bc9d17a8-0fa9-45b1-a53d-aa6e28273c9c_1747x1215.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e0bd9726-da6f-4918-9d85-8f79eb97462a_1747x1215.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1013,&quot;width&quot;:1456,&quot;resizeWidth&quot;:620,&quot;bytes&quot;:86500,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Uh8E!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc9d17a8-0fa9-45b1-a53d-aa6e28273c9c_1747x1215.png 424w, https://substackcdn.com/image/fetch/$s_!Uh8E!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc9d17a8-0fa9-45b1-a53d-aa6e28273c9c_1747x1215.png 848w, https://substackcdn.com/image/fetch/$s_!Uh8E!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc9d17a8-0fa9-45b1-a53d-aa6e28273c9c_1747x1215.png 1272w, https://substackcdn.com/image/fetch/$s_!Uh8E!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbc9d17a8-0fa9-45b1-a53d-aa6e28273c9c_1747x1215.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" 
stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">TOC - Proprietary model APIs vs. AI Cloud (At scale)</figcaption></figure></div><h4>To summarise.</h4><ul><li><p>If you are pre-training or finetuning, AI Cloud is your best friend.</p></li><li><p>If you are just prototyping an LLM based application, it might be a good idea to start off with proprietary model APIs to prove the viability of your application in the first place.</p></li><li><p>After proving the case, oftentimes the next step is to cut the costs. It might involve switching to AI Cloud and deploying your own open source models. It also gives you more control and additional capabilities to tune the system.</p></li><li><p>Understanding your TOC is extremely important when choosing where to get your LLM inference tokens from.</p></li></ul><p>Now, let&#8217;s go build a chatbot and run it on an Open Source LLM deployed on Nebius.</p><p></p><h3>A hands-on project. 
Let&#8217;s go build! </h3><p>If you haven&#8217;t yet, be sure to register for Nebius Cloud so you can follow along:</p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://nebius.com/explorer-tier?utm_medium=newsletter&amp;utm_source=ag&amp;utm_campaign=explorer-tier&quot;,&quot;text&quot;:&quot;Create an account&quot;,&quot;action&quot;:null,&quot;class&quot;:&quot;button-wrapper&quot;}" data-component-name="ButtonCreateButton"><a class="button primary button-wrapper" href="https://nebius.com/explorer-tier?utm_medium=newsletter&amp;utm_source=ag&amp;utm_campaign=explorer-tier"><span>Create an account</span></a></p><p></p><p>The end goal of the project is the following:</p><ul><li><p>A Kubernetes cluster running on Nebius Cloud with a GPU compute worker node. We will use an NVIDIA&#174; H100 Tensor Core GPU instance, which is now available for 1.5 USD/hour.</p></li><li><p>Deploy a vLLM server with the Mistral-7b-instruct LLM as a pod. We will use the previously deployed K8s cluster for this.</p></li><li><p>Write a very simple chatbot with Streamlit that will provide an easy interface to test the model.</p></li><li><p>Deploy the chatbot as a container in Kubernetes.</p></li><li><p>Expose it via a LoadBalancer service so that you can access it through the internet. 
</p></li></ul><p></p><h4>Step 1: Deploy the Kubernetes cluster.</h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vB9t!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74a2b2b5-9482-4991-aee6-e71e1d3d3e45_1070x894.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vB9t!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74a2b2b5-9482-4991-aee6-e71e1d3d3e45_1070x894.png 424w, https://substackcdn.com/image/fetch/$s_!vB9t!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74a2b2b5-9482-4991-aee6-e71e1d3d3e45_1070x894.png 848w, https://substackcdn.com/image/fetch/$s_!vB9t!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74a2b2b5-9482-4991-aee6-e71e1d3d3e45_1070x894.png 1272w, https://substackcdn.com/image/fetch/$s_!vB9t!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74a2b2b5-9482-4991-aee6-e71e1d3d3e45_1070x894.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vB9t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74a2b2b5-9482-4991-aee6-e71e1d3d3e45_1070x894.png" width="438" height="365.9551401869159" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/74a2b2b5-9482-4991-aee6-e71e1d3d3e45_1070x894.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d0485280-aa28-47fa-9242-e9d30fbdfdeb_1070x894.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:894,&quot;width&quot;:1070,&quot;resizeWidth&quot;:438,&quot;bytes&quot;:71733,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vB9t!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74a2b2b5-9482-4991-aee6-e71e1d3d3e45_1070x894.png 424w, https://substackcdn.com/image/fetch/$s_!vB9t!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74a2b2b5-9482-4991-aee6-e71e1d3d3e45_1070x894.png 848w, https://substackcdn.com/image/fetch/$s_!vB9t!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74a2b2b5-9482-4991-aee6-e71e1d3d3e45_1070x894.png 1272w, https://substackcdn.com/image/fetch/$s_!vB9t!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F74a2b2b5-9482-4991-aee6-e71e1d3d3e45_1070x894.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" 
stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Step 1 goal.</figcaption></figure></div><p>Before you can start the deployment, you will need to install the Nebius CLI tool. If you already have your account set up, it should be straightforward.</p><p>Assuming that you are running macOS on Apple Silicon (the kubectl download below targets darwin/arm64), run:</p><pre><code>brew install jq
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/darwin/arm64/kubectl"
chmod +x ./kubectl
sudo mv ./kubectl /usr/local/bin/kubectl
sudo chown root: /usr/local/bin/kubectl
curl -sSL https://storage.ai.nebius.cloud/nebius/install.sh | bash</code></pre><p>This will set you up with the kubectl tool for communicating with the Kubernetes cluster and the Nebius CLI tool for authenticating with Nebius. After this, run:</p><pre><code>nebius profile create</code></pre><p>It will prompt you for:</p><ul><li><p>Name - enter any name.</p></li><li><p>Api endpoint - leave the default <em>api.eu.nebius.cloud.</em></p></li><li><p>Authorisation type - choose <em>federation.</em></p></li></ul><p>After that, you will be redirected to a browser window to authenticate. You now have your Nebius CLI tool set up.</p><p>If anything fails at this point, refer to the official Nebius documentation here: <a href="https://docs.nebius.com/kubernetes/quickstart/#env-install">Link</a>.</p><p><em>Disclaimer: to follow this tutorial you will need to have billing set up. As mentioned before, a single NVIDIA&#174; H100 Tensor Core GPU node costs 1.5 USD/hour, and there is also a small charge for the public IP addresses assigned in steps 3 and 4 of the project. Once done, be sure to clean up your resources so that you do not incur additional charges.</em></p><p>For simplicity, we will perform the deployment via the Nebius Cloud UI. Here is how you can do it. 
It is easy to figure out, but let&#8217;s go step by step so that there are no unanswered questions.</p><ul><li><p>Once you log in to the console, click on the <em>&#8220;Managed Kubernetes&#8221; </em>tab on the left and click <em>&#8220;+ Create cluster&#8221;</em> in the top right.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!c2Fe!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa891b138-df63-4e54-b989-7b8ca6f6b153_1126x487.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!c2Fe!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa891b138-df63-4e54-b989-7b8ca6f6b153_1126x487.png 424w, https://substackcdn.com/image/fetch/$s_!c2Fe!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa891b138-df63-4e54-b989-7b8ca6f6b153_1126x487.png 848w, https://substackcdn.com/image/fetch/$s_!c2Fe!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa891b138-df63-4e54-b989-7b8ca6f6b153_1126x487.png 1272w, https://substackcdn.com/image/fetch/$s_!c2Fe!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa891b138-df63-4e54-b989-7b8ca6f6b153_1126x487.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!c2Fe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa891b138-df63-4e54-b989-7b8ca6f6b153_1126x487.png" width="1126" height="487" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a891b138-df63-4e54-b989-7b8ca6f6b153_1126x487.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:487,&quot;width&quot;:1126,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:214293,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!c2Fe!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa891b138-df63-4e54-b989-7b8ca6f6b153_1126x487.png 424w, https://substackcdn.com/image/fetch/$s_!c2Fe!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa891b138-df63-4e54-b989-7b8ca6f6b153_1126x487.png 848w, https://substackcdn.com/image/fetch/$s_!c2Fe!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa891b138-df63-4e54-b989-7b8ca6f6b153_1126x487.png 1272w, https://substackcdn.com/image/fetch/$s_!c2Fe!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa891b138-df63-4e54-b989-7b8ca6f6b153_1126x487.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p>Once in the cluster creation section:</p><ul><li><p>Provide the cluster name.</p></li><li><p>We switch the <em>Control plane high availability</em> off, as there is no need for it in a demo project. Be sure to always have it turned on for production use cases.</p></li><li><p>Let&#8217;s keep the <em>Public endpoint</em> on, as it will make configuring kubectl easier for this example.</p></li><li><p>Press <em>&#8220;Create cluster&#8221; </em>once configuration is complete.</p></li></ul></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!FGvt!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59f151f8-c7c1-42fd-94ba-ccfcdb93c416_1020x487.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" 
srcset="https://substackcdn.com/image/fetch/$s_!FGvt!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59f151f8-c7c1-42fd-94ba-ccfcdb93c416_1020x487.png 424w, https://substackcdn.com/image/fetch/$s_!FGvt!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59f151f8-c7c1-42fd-94ba-ccfcdb93c416_1020x487.png 848w, https://substackcdn.com/image/fetch/$s_!FGvt!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59f151f8-c7c1-42fd-94ba-ccfcdb93c416_1020x487.png 1272w, https://substackcdn.com/image/fetch/$s_!FGvt!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59f151f8-c7c1-42fd-94ba-ccfcdb93c416_1020x487.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!FGvt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59f151f8-c7c1-42fd-94ba-ccfcdb93c416_1020x487.png" width="1020" height="487" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/59f151f8-c7c1-42fd-94ba-ccfcdb93c416_1020x487.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:487,&quot;width&quot;:1020,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:182742,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!FGvt!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59f151f8-c7c1-42fd-94ba-ccfcdb93c416_1020x487.png 424w, https://substackcdn.com/image/fetch/$s_!FGvt!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59f151f8-c7c1-42fd-94ba-ccfcdb93c416_1020x487.png 848w, https://substackcdn.com/image/fetch/$s_!FGvt!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59f151f8-c7c1-42fd-94ba-ccfcdb93c416_1020x487.png 1272w, https://substackcdn.com/image/fetch/$s_!FGvt!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F59f151f8-c7c1-42fd-94ba-ccfcdb93c416_1020x487.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" 
stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p>You will see a new cluster being provisioned in the <em>&#8220;Managed Kubernetes&#8221;</em> overview tab. Click on it.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xATW!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7db4678-ab53-4b99-bc99-cf8e376c37ec_1204x538.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xATW!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7db4678-ab53-4b99-bc99-cf8e376c37ec_1204x538.png 424w, https://substackcdn.com/image/fetch/$s_!xATW!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7db4678-ab53-4b99-bc99-cf8e376c37ec_1204x538.png 848w, https://substackcdn.com/image/fetch/$s_!xATW!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7db4678-ab53-4b99-bc99-cf8e376c37ec_1204x538.png 1272w, https://substackcdn.com/image/fetch/$s_!xATW!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7db4678-ab53-4b99-bc99-cf8e376c37ec_1204x538.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!xATW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7db4678-ab53-4b99-bc99-cf8e376c37ec_1204x538.png" width="1204" height="538" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f7db4678-ab53-4b99-bc99-cf8e376c37ec_1204x538.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:538,&quot;width&quot;:1204,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:287409,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xATW!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7db4678-ab53-4b99-bc99-cf8e376c37ec_1204x538.png 424w, https://substackcdn.com/image/fetch/$s_!xATW!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7db4678-ab53-4b99-bc99-cf8e376c37ec_1204x538.png 848w, https://substackcdn.com/image/fetch/$s_!xATW!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7db4678-ab53-4b99-bc99-cf8e376c37ec_1204x538.png 1272w, https://substackcdn.com/image/fetch/$s_!xATW!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff7db4678-ab53-4b99-bc99-cf8e376c37ec_1204x538.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft 
pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p>The above steps have provisioned a control plane for K8s, now we need to add some worker nodes. 
Click on the &#8220;<em>Node groups&#8221; </em>tab.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ia9h!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50c2cfdc-a7dc-4182-84c7-311ffc28a70f_1200x538.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ia9h!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50c2cfdc-a7dc-4182-84c7-311ffc28a70f_1200x538.png 424w, https://substackcdn.com/image/fetch/$s_!Ia9h!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50c2cfdc-a7dc-4182-84c7-311ffc28a70f_1200x538.png 848w, https://substackcdn.com/image/fetch/$s_!Ia9h!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50c2cfdc-a7dc-4182-84c7-311ffc28a70f_1200x538.png 1272w, https://substackcdn.com/image/fetch/$s_!Ia9h!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50c2cfdc-a7dc-4182-84c7-311ffc28a70f_1200x538.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ia9h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50c2cfdc-a7dc-4182-84c7-311ffc28a70f_1200x538.png" width="1200" height="538" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/50c2cfdc-a7dc-4182-84c7-311ffc28a70f_1200x538.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:538,&quot;width&quot;:1200,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:260268,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ia9h!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50c2cfdc-a7dc-4182-84c7-311ffc28a70f_1200x538.png 424w, https://substackcdn.com/image/fetch/$s_!Ia9h!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50c2cfdc-a7dc-4182-84c7-311ffc28a70f_1200x538.png 848w, https://substackcdn.com/image/fetch/$s_!Ia9h!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50c2cfdc-a7dc-4182-84c7-311ffc28a70f_1200x538.png 1272w, https://substackcdn.com/image/fetch/$s_!Ia9h!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50c2cfdc-a7dc-4182-84c7-311ffc28a70f_1200x538.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p>Click <em>&#8220;+ Create new group&#8221;</em>.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!7GRx!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F001a4cb4-e925-4d4e-9fc5-f15baa6ec9fb_1460x512.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!7GRx!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F001a4cb4-e925-4d4e-9fc5-f15baa6ec9fb_1460x512.png 424w, https://substackcdn.com/image/fetch/$s_!7GRx!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F001a4cb4-e925-4d4e-9fc5-f15baa6ec9fb_1460x512.png 848w, 
https://substackcdn.com/image/fetch/$s_!7GRx!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F001a4cb4-e925-4d4e-9fc5-f15baa6ec9fb_1460x512.png 1272w, https://substackcdn.com/image/fetch/$s_!7GRx!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F001a4cb4-e925-4d4e-9fc5-f15baa6ec9fb_1460x512.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!7GRx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F001a4cb4-e925-4d4e-9fc5-f15baa6ec9fb_1460x512.png" width="1456" height="511" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/001a4cb4-e925-4d4e-9fc5-f15baa6ec9fb_1460x512.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:511,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:221492,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!7GRx!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F001a4cb4-e925-4d4e-9fc5-f15baa6ec9fb_1460x512.png 424w, https://substackcdn.com/image/fetch/$s_!7GRx!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F001a4cb4-e925-4d4e-9fc5-f15baa6ec9fb_1460x512.png 848w, 
https://substackcdn.com/image/fetch/$s_!7GRx!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F001a4cb4-e925-4d4e-9fc5-f15baa6ec9fb_1460x512.png 1272w, https://substackcdn.com/image/fetch/$s_!7GRx!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F001a4cb4-e925-4d4e-9fc5-f15baa6ec9fb_1460x512.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p>Configure the Node Group for our project:</p><ul><li><p>Give it a <em>Name</em>.</p></li><li><p>Disable the <em>Public IPv4 
address </em>as we will be using LoadBalancer services to expose our apps, no need for the node itself to have a public address.</p></li><li><p>Reduce the <em>Number of nodes</em> to 1 as we don&#8217;t need more for this project.</p></li><li><p>Choose the <em>Platform</em> correctly: we want <em>NVIDIA&#174; H100 NVLink with Intel Sapphire Rapids </em>GPU nodes as these are the ones we are getting discounts on from Nebius.</p></li><li><p>In the <em>Preset</em> field be sure to select 1 GPU as for serving purposes we don&#8217;t need more.</p></li></ul></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Xpy6!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Xpy6!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png 424w, https://substackcdn.com/image/fetch/$s_!Xpy6!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png 848w, https://substackcdn.com/image/fetch/$s_!Xpy6!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png 1272w, https://substackcdn.com/image/fetch/$s_!Xpy6!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!Xpy6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png" width="969" height="538" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:538,&quot;width&quot;:969,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:176004,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Xpy6!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png 424w, https://substackcdn.com/image/fetch/$s_!Xpy6!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png 848w, https://substackcdn.com/image/fetch/$s_!Xpy6!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png 1272w, https://substackcdn.com/image/fetch/$s_!Xpy6!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F143f6170-99fd-416e-9f88-a3d6b3e216de_969x538.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset 
pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p>Once the Cluster and the Nodes have been successfully provisioned, we go to the <em>Applications</em> tab and install the NVIDIA GPU Operator. 
This step is very important and takes a lot of work off the user&#8217;s plate: setting up K8s clusters so that they integrate properly with GPUs is not an easy task, and the set of applications deployed in this step takes care of all of it for us in one click.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!_Sz-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c4fa87d-c9b6-4fc3-ab8a-c8f0240e2c44_1371x538.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!_Sz-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c4fa87d-c9b6-4fc3-ab8a-c8f0240e2c44_1371x538.png 424w, https://substackcdn.com/image/fetch/$s_!_Sz-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c4fa87d-c9b6-4fc3-ab8a-c8f0240e2c44_1371x538.png 848w, https://substackcdn.com/image/fetch/$s_!_Sz-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c4fa87d-c9b6-4fc3-ab8a-c8f0240e2c44_1371x538.png 1272w, https://substackcdn.com/image/fetch/$s_!_Sz-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c4fa87d-c9b6-4fc3-ab8a-c8f0240e2c44_1371x538.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!_Sz-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c4fa87d-c9b6-4fc3-ab8a-c8f0240e2c44_1371x538.png" width="1371" height="538" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9c4fa87d-c9b6-4fc3-ab8a-c8f0240e2c44_1371x538.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:538,&quot;width&quot;:1371,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:278889,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!_Sz-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c4fa87d-c9b6-4fc3-ab8a-c8f0240e2c44_1371x538.png 424w, https://substackcdn.com/image/fetch/$s_!_Sz-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c4fa87d-c9b6-4fc3-ab8a-c8f0240e2c44_1371x538.png 848w, https://substackcdn.com/image/fetch/$s_!_Sz-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c4fa87d-c9b6-4fc3-ab8a-c8f0240e2c44_1371x538.png 1272w, https://substackcdn.com/image/fetch/$s_!_Sz-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9c4fa87d-c9b6-4fc3-ab8a-c8f0240e2c44_1371x538.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p>Click <em>&#8220;Install application&#8221;</em> on the next screen.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qRRu!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9277d2e9-485d-4274-84f0-6342ccb97937_1460x531.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qRRu!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9277d2e9-485d-4274-84f0-6342ccb97937_1460x531.png 424w, https://substackcdn.com/image/fetch/$s_!qRRu!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9277d2e9-485d-4274-84f0-6342ccb97937_1460x531.png 848w, 
https://substackcdn.com/image/fetch/$s_!qRRu!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9277d2e9-485d-4274-84f0-6342ccb97937_1460x531.png 1272w, https://substackcdn.com/image/fetch/$s_!qRRu!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9277d2e9-485d-4274-84f0-6342ccb97937_1460x531.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qRRu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9277d2e9-485d-4274-84f0-6342ccb97937_1460x531.png" width="1456" height="530" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9277d2e9-485d-4274-84f0-6342ccb97937_1460x531.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:530,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:170736,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qRRu!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9277d2e9-485d-4274-84f0-6342ccb97937_1460x531.png 424w, https://substackcdn.com/image/fetch/$s_!qRRu!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9277d2e9-485d-4274-84f0-6342ccb97937_1460x531.png 848w, 
https://substackcdn.com/image/fetch/$s_!qRRu!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9277d2e9-485d-4274-84f0-6342ccb97937_1460x531.png 1272w, https://substackcdn.com/image/fetch/$s_!qRRu!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9277d2e9-485d-4274-84f0-6342ccb97937_1460x531.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><ul><li><p>That&#8217;s it, we are ready to connect to our new Kubernetes cluster. 
Click on the <em>&#8220;How to connect&#8221;</em> button, copy the third command and run it. We are now ready to use kubectl to communicate with the cluster. Let&#8217;s try it.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!rzQI!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F161ebcc1-fd05-412d-b34d-06bfc3cfdd3a_1459x637.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!rzQI!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F161ebcc1-fd05-412d-b34d-06bfc3cfdd3a_1459x637.png 424w, https://substackcdn.com/image/fetch/$s_!rzQI!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F161ebcc1-fd05-412d-b34d-06bfc3cfdd3a_1459x637.png 848w, https://substackcdn.com/image/fetch/$s_!rzQI!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F161ebcc1-fd05-412d-b34d-06bfc3cfdd3a_1459x637.png 1272w, https://substackcdn.com/image/fetch/$s_!rzQI!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F161ebcc1-fd05-412d-b34d-06bfc3cfdd3a_1459x637.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!rzQI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F161ebcc1-fd05-412d-b34d-06bfc3cfdd3a_1459x637.png" width="1456" height="636" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/161ebcc1-fd05-412d-b34d-06bfc3cfdd3a_1459x637.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:636,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:186618,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!rzQI!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F161ebcc1-fd05-412d-b34d-06bfc3cfdd3a_1459x637.png 424w, https://substackcdn.com/image/fetch/$s_!rzQI!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F161ebcc1-fd05-412d-b34d-06bfc3cfdd3a_1459x637.png 848w, https://substackcdn.com/image/fetch/$s_!rzQI!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F161ebcc1-fd05-412d-b34d-06bfc3cfdd3a_1459x637.png 1272w, https://substackcdn.com/image/fetch/$s_!rzQI!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F161ebcc1-fd05-412d-b34d-06bfc3cfdd3a_1459x637.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" 
xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Run the following in the terminal:</p><pre><code>kubectl get pods</code></pre><p>You should see something similar to:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!B_La!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe91beaf3-2c70-441c-89b5-017c1cf6dbf1_1468x496.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!B_La!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe91beaf3-2c70-441c-89b5-017c1cf6dbf1_1468x496.png 424w, 
https://substackcdn.com/image/fetch/$s_!B_La!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe91beaf3-2c70-441c-89b5-017c1cf6dbf1_1468x496.png 848w, https://substackcdn.com/image/fetch/$s_!B_La!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe91beaf3-2c70-441c-89b5-017c1cf6dbf1_1468x496.png 1272w, https://substackcdn.com/image/fetch/$s_!B_La!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe91beaf3-2c70-441c-89b5-017c1cf6dbf1_1468x496.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!B_La!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe91beaf3-2c70-441c-89b5-017c1cf6dbf1_1468x496.png" width="1456" height="492" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e91beaf3-2c70-441c-89b5-017c1cf6dbf1_1468x496.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:492,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:134170,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!B_La!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe91beaf3-2c70-441c-89b5-017c1cf6dbf1_1468x496.png 424w, 
https://substackcdn.com/image/fetch/$s_!B_La!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe91beaf3-2c70-441c-89b5-017c1cf6dbf1_1468x496.png 848w, https://substackcdn.com/image/fetch/$s_!B_La!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe91beaf3-2c70-441c-89b5-017c1cf6dbf1_1468x496.png 1272w, https://substackcdn.com/image/fetch/$s_!B_La!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe91beaf3-2c70-441c-89b5-017c1cf6dbf1_1468x496.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line 
x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>If you do, that&#8217;s great: all of these pods were deployed by the NVIDIA GPU Operator installed in the previous step. Disregard the one in a pending state; it is not important for this project.</p><p>In production projects you should consider deploying all of your infrastructure using Terraform.</p><p></p><h4>Step 2: Deploy Mistral-7B-Instruct via vLLM and expose it outside of the cluster.</h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!eFCy!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf760f68-6227-49ff-b81f-51ac1f68ec7a_1311x999.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!eFCy!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf760f68-6227-49ff-b81f-51ac1f68ec7a_1311x999.png 424w, https://substackcdn.com/image/fetch/$s_!eFCy!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf760f68-6227-49ff-b81f-51ac1f68ec7a_1311x999.png 848w, https://substackcdn.com/image/fetch/$s_!eFCy!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf760f68-6227-49ff-b81f-51ac1f68ec7a_1311x999.png 1272w, https://substackcdn.com/image/fetch/$s_!eFCy!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf760f68-6227-49ff-b81f-51ac1f68ec7a_1311x999.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!eFCy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf760f68-6227-49ff-b81f-51ac1f68ec7a_1311x999.png" width="552" height="420.63157894736844" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bf760f68-6227-49ff-b81f-51ac1f68ec7a_1311x999.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/f3be699e-6342-418b-94f9-e01e11b35fd2_1311x999.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:999,&quot;width&quot;:1311,&quot;resizeWidth&quot;:552,&quot;bytes&quot;:114872,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!eFCy!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf760f68-6227-49ff-b81f-51ac1f68ec7a_1311x999.png 424w, https://substackcdn.com/image/fetch/$s_!eFCy!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf760f68-6227-49ff-b81f-51ac1f68ec7a_1311x999.png 848w, https://substackcdn.com/image/fetch/$s_!eFCy!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf760f68-6227-49ff-b81f-51ac1f68ec7a_1311x999.png 1272w, https://substackcdn.com/image/fetch/$s_!eFCy!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbf760f68-6227-49ff-b81f-51ac1f68ec7a_1311x999.png 1456w" sizes="100vw" loading="lazy"></picture><div 
class="image-link-expand"></div></div></a><figcaption class="image-caption">Step 2 goal.</figcaption></figure></div><p>Create the following three files:</p><p><em>hf-secret.yaml</em></p><pre><code>apiVersion: v1
kind: Secret
metadata:
  name: hf-token-secret
  namespace: default
type: Opaque
data:
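  # Values under data must be base64-encoded. One way to produce the value,
  # assuming the raw token sits in a hypothetical HF_TOKEN shell variable:
  #   printf '%s' "$HF_TOKEN" | base64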
  token: "&lt;Your HuggingFace access token for gated models (base64-encoded)&gt;"</code></pre><ul><li><p>The above creates a Secret that vLLM will use to pull gated models from HuggingFace; mistralai/Mistral-7B-Instruct-v0.3 is one such model.</p></li></ul><p><em>mistral-deployment.yaml</em></p><pre><code>apiVersion: apps/v1
kind: Deployment
metadata:
  name: mistral-7b
  namespace: default
  labels:
    app: mistral-7b
spec:
  replicas: 1
  selector:
    matchLabels:
      app: mistral-7b
  template:
    metadata:
      labels:
        app: mistral-7b
    spec:
      volumes:
      - name: cache-volume
        emptyDir: {}
      - name: shm
        emptyDir:
          medium: Memory
          sizeLimit: "2Gi"
      containers:
      - name: mistral-7b
        image: vllm/vllm-openai:latest
        command: ["/bin/sh", "-c"]
        args: [
          "vllm serve mistralai/Mistral-7B-Instruct-v0.3 --trust-remote-code --enable-chunked-prefill --max_num_batched_tokens 1024"
        ]
        env:
        - name: HUGGING_FACE_HUB_TOKEN
          valueFrom:
            secretKeyRef:
              name: hf-token-secret
              key: token
        ports:
        - containerPort: 8000
        resources:
          limits:
            cpu: "10"
            memory: 20G
            nvidia.com/gpu: "1"
          requests:
            cpu: "2"
            memory: 6G
            nvidia.com/gpu: "1"
        volumeMounts:
        - mountPath: /root/.cache/huggingface
          name: cache-volume
        - name: shm
          mountPath: /dev/shm
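        # Optional sketch, not part of the original setup: a startupProbe holds
        # back the other probes until it first succeeds, which is one way to
        # give a slow-starting pod time. The values below are assumptions.
        # startupProbe:
        #   httpGet:
        #     path: /health
        #     port: 8000
        #   periodSeconds: 30
        #   failureThreshold: 80  # 80 checks x 30 s = up to 40 minutes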
        readinessProbe:
          httpGet:
            path: /health
            port: 8000
          initialDelaySeconds: 60
          periodSeconds: 5</code></pre><ul><li><p><strong>[IMPORTANT]:</strong> Do not run the default vLLM Kubernetes deployment examples from their docs as they are: their Liveness Probes are misconfigured for slow-starting pods. This pod will take ~30 minutes to become operational, so with those probes Kubernetes would keep restarting it before startup finishes and the pod would end up in an endless crash loop. The manifest above defines only a Readiness Probe, which keeps traffic away from the pod until it is ready but never restarts it.</p></li></ul><p><em>services.yaml</em></p><pre><code>apiVersion: v1
kind: Service
metadata:
  name: mistral-7b
  namespace: default
spec:
  ports:
  - name: http-mistral-7b
    port: 80
    protocol: TCP
    targetPort: 8000
  selector:
    app: mistral-7b
  sessionAffinity: None
  type: ClusterIP

---

apiVersion: v1
kind: Service
metadata:
  name: mistral-7b-lb
  namespace: default
spec:
  ports:
  - name: http-mistral-7b
    port: 80
    protocol: TCP
    targetPort: 8000
  selector:
    app: mistral-7b
  sessionAffinity: None
  type: LoadBalancer</code></pre><ul><li><p>We create two services here:</p><ul><li><p>ClusterIP - for exposing the service inside of the cluster.</p></li><li><p>LoadBalancer - for exposing the service via a public IP so that we can develop against it from our machines.</p></li></ul></li></ul><p>Apply them one by one:</p><pre><code>kubectl apply -f hf-secret.yaml</code></pre><pre><code>kubectl apply -f mistral-deployment.yaml</code></pre><pre><code>kubectl apply -f services.yaml</code></pre><p>As mentioned before, it will take around 30 minutes for the vLLM pod to start up: first we pull the <em>vllm/vllm-openai</em> container image, which is around 5 GB in size, and after that we download the <em>mistralai/Mistral-7B-Instruct-v0.3</em> weights, which are around 14 GB. Over time the state of the pod will move from <em>ContainerCreating:</em></p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!e6nU!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a51b7cc-11d4-46c1-9729-340802c5e92a_820x86.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!e6nU!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a51b7cc-11d4-46c1-9729-340802c5e92a_820x86.png 424w, https://substackcdn.com/image/fetch/$s_!e6nU!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a51b7cc-11d4-46c1-9729-340802c5e92a_820x86.png 848w, https://substackcdn.com/image/fetch/$s_!e6nU!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a51b7cc-11d4-46c1-9729-340802c5e92a_820x86.png 1272w, 
https://substackcdn.com/image/fetch/$s_!e6nU!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a51b7cc-11d4-46c1-9729-340802c5e92a_820x86.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!e6nU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a51b7cc-11d4-46c1-9729-340802c5e92a_820x86.png" width="820" height="86" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4a51b7cc-11d4-46c1-9729-340802c5e92a_820x86.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:86,&quot;width&quot;:820,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:14507,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!e6nU!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a51b7cc-11d4-46c1-9729-340802c5e92a_820x86.png 424w, https://substackcdn.com/image/fetch/$s_!e6nU!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a51b7cc-11d4-46c1-9729-340802c5e92a_820x86.png 848w, https://substackcdn.com/image/fetch/$s_!e6nU!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a51b7cc-11d4-46c1-9729-340802c5e92a_820x86.png 1272w, 
https://substackcdn.com/image/fetch/$s_!e6nU!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a51b7cc-11d4-46c1-9729-340802c5e92a_820x86.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>To <em>Running </em>but not ready:</p><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Y725!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ad71fe8-864f-43fa-8a14-222507423d9f_774x88.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Y725!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ad71fe8-864f-43fa-8a14-222507423d9f_774x88.png 424w, https://substackcdn.com/image/fetch/$s_!Y725!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ad71fe8-864f-43fa-8a14-222507423d9f_774x88.png 848w, https://substackcdn.com/image/fetch/$s_!Y725!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ad71fe8-864f-43fa-8a14-222507423d9f_774x88.png 1272w, https://substackcdn.com/image/fetch/$s_!Y725!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ad71fe8-864f-43fa-8a14-222507423d9f_774x88.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Y725!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ad71fe8-864f-43fa-8a14-222507423d9f_774x88.png" width="774" height="88" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4ad71fe8-864f-43fa-8a14-222507423d9f_774x88.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:88,&quot;width&quot;:774,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:14088,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Y725!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ad71fe8-864f-43fa-8a14-222507423d9f_774x88.png 424w, https://substackcdn.com/image/fetch/$s_!Y725!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ad71fe8-864f-43fa-8a14-222507423d9f_774x88.png 848w, https://substackcdn.com/image/fetch/$s_!Y725!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ad71fe8-864f-43fa-8a14-222507423d9f_774x88.png 1272w, https://substackcdn.com/image/fetch/$s_!Y725!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4ad71fe8-864f-43fa-8a14-222507423d9f_774x88.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Here we need to be patient and wait until the container becomes ready. While we are waiting, let&#8217;s find the public IP that we will use to interact with the LLM server. 
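</p><p>If you would rather not re-check the pod status by hand while waiting, the check can be scripted. Below is a minimal sketch, not part of the original project: it shells out to <em>kubectl get pods -o json</em> and reports whether every container in the current namespace is ready. The helper names <em>all_containers_ready</em> and <em>wait_for_pods</em>, the number of attempts and the polling interval are assumptions of mine, so adjust them to your setup.</p><pre><code>import json
import subprocess
import time


def all_containers_ready(pod_list):
    """Return True only if there is at least one pod and every container reports ready."""
    pods = pod_list.get("items", [])
    if not pods:
        return False
    for pod in pods:
        statuses = pod.get("status", {}).get("containerStatuses", [])
        # A pod that has not reported any container status yet is treated as not ready.
        if not statuses or not all(s.get("ready", False) for s in statuses):
            return False
    return True


def wait_for_pods(attempts=60, interval_seconds=10):
    """Poll the current namespace until all pods are ready or the attempts run out."""
    for _ in range(attempts):
        result = subprocess.run(
            ["kubectl", "get", "pods", "-o", "json"],
            capture_output=True, text=True, check=True,
        )
        if all_containers_ready(json.loads(result.stdout)):
            return True
        time.sleep(interval_seconds)
    return False


if __name__ == "__main__":
    print("ready" if wait_for_pods() else "timed out")</code></pre><p>Now, back to finding the public IP. 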
Run</p><pre><code>kubectl get services</code></pre><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xhKp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F671713f2-7045-42ae-92f0-52369434518e_648x135.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xhKp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F671713f2-7045-42ae-92f0-52369434518e_648x135.png 424w, https://substackcdn.com/image/fetch/$s_!xhKp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F671713f2-7045-42ae-92f0-52369434518e_648x135.png 848w, https://substackcdn.com/image/fetch/$s_!xhKp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F671713f2-7045-42ae-92f0-52369434518e_648x135.png 1272w, https://substackcdn.com/image/fetch/$s_!xhKp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F671713f2-7045-42ae-92f0-52369434518e_648x135.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xhKp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F671713f2-7045-42ae-92f0-52369434518e_648x135.png" width="728" height="151.66666666666666" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/671713f2-7045-42ae-92f0-52369434518e_648x135.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:135,&quot;width&quot;:648,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:26126,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xhKp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F671713f2-7045-42ae-92f0-52369434518e_648x135.png 424w, https://substackcdn.com/image/fetch/$s_!xhKp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F671713f2-7045-42ae-92f0-52369434518e_648x135.png 848w, https://substackcdn.com/image/fetch/$s_!xhKp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F671713f2-7045-42ae-92f0-52369434518e_648x135.png 1272w, https://substackcdn.com/image/fetch/$s_!xhKp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F671713f2-7045-42ae-92f0-52369434518e_648x135.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>We are interested in the value of <em>External-IP, </em>in my case <em>195.242.13.1 , </em>let&#8217;s prepare a simple curl to test if the model has been exposed successfully.</p><pre><code>curl &lt;Your IP Address&gt;/v1/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "mistralai/Mistral-7B-Instruct-v0.3",
        "prompt": "We are building a chatbot, what are you up to?",
        "max_tokens": 500,
        "temperature": 0
      }'</code></pre><p>Here is my answer:</p><blockquote><p>I'm a model and don't have personal experiences or activities. I'm here to help you with your questions and tasks related to building a chatbot. Let's get started! What specific questions or issues do you have regarding your chatbot project?\n\nI'm new to chatbot development, where should I</p></blockquote><p>Good thing we capped it at 500 max tokens ;)</p><p>Nice, we have exposed the Mistral LLM publicly from a K8s cluster running on the Nebius platform. The model is also deployed on a GPU!</p><p></p><h4>Step 3: Developing a chatbot locally that would communicate with the previously exposed model endpoint.</h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!d_eG!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e71b422-3689-4f5c-9efe-38c1c3ab636e_1311x1136.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!d_eG!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e71b422-3689-4f5c-9efe-38c1c3ab636e_1311x1136.png 424w, https://substackcdn.com/image/fetch/$s_!d_eG!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e71b422-3689-4f5c-9efe-38c1c3ab636e_1311x1136.png 848w, 
https://substackcdn.com/image/fetch/$s_!d_eG!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e71b422-3689-4f5c-9efe-38c1c3ab636e_1311x1136.png 1272w, https://substackcdn.com/image/fetch/$s_!d_eG!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e71b422-3689-4f5c-9efe-38c1c3ab636e_1311x1136.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!d_eG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e71b422-3689-4f5c-9efe-38c1c3ab636e_1311x1136.png" width="550" height="476.58276125095347" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/4e71b422-3689-4f5c-9efe-38c1c3ab636e_1311x1136.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/2872c6d6-b876-4d30-a66b-4251e668bd9e_1311x1136.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1136,&quot;width&quot;:1311,&quot;resizeWidth&quot;:550,&quot;bytes&quot;:136966,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!d_eG!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e71b422-3689-4f5c-9efe-38c1c3ab636e_1311x1136.png 424w, https://substackcdn.com/image/fetch/$s_!d_eG!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e71b422-3689-4f5c-9efe-38c1c3ab636e_1311x1136.png 848w, 
https://substackcdn.com/image/fetch/$s_!d_eG!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e71b422-3689-4f5c-9efe-38c1c3ab636e_1311x1136.png 1272w, https://substackcdn.com/image/fetch/$s_!d_eG!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4e71b422-3689-4f5c-9efe-38c1c3ab636e_1311x1136.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Step 3 goal.</figcaption></figure></div><p>For simplicity reasons, we will implement the chatbot using Streamlit. 
I have prepared a script that will do the trick, feel free to copy the code into a new file, e.g. <em>streamlit_app.py</em>:</p><pre><code>from openai import OpenAI
import streamlit as st

st.title("Our test Chatbot")

client = OpenAI(
    base_url="http://&lt;Your IP address&gt;/v1",
    api_key="abc"
)

if "messages" not in st.session_state:
    st.session_state["messages"] = [{"role": "assistant", "content": "Hello! How can I assist you today?"}]

for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

if prompt := st.chat_input("Hello! How can I assist you today?"):
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    with st.chat_message("assistant"):
        message_history = ""
        for m in st.session_state.messages:
            message_history += f"role: {m['role']}, content: {m['content']}\n"
        stream = client.chat.completions.create(
            model="mistralai/Mistral-7B-Instruct-v0.3",
            messages=[
                {"role": "system", "content": f"You are a chatbot built on top of Mistral 7B Large Language Model, here is our message history: \n {message_history}"},
                {"role": "user", "content": prompt}
            ],
            stream=True,
            max_tokens=500
        )
        response = st.write_stream(stream)
    st.session_state.messages.append({"role": "assistant", "content": response})</code></pre><p>A few points on the script:</p><ul><li><p>Don&#8217;t forget to enter the public IP address of the exposed model on <em>line 7</em>.</p></li><li><p>We are implementing thread memory ourselves by concatenating the chat history and inserting it into the system prompt each time we call the API.</p></li><li><p>We are using the official openai library to communicate with our model, since it has been deployed behind an OpenAI-compatible API.</p></li></ul><p>We only need a couple of dependencies here. Install them by running</p><pre><code>pip install streamlit openai</code></pre><p>Once you have the dependencies, you can start the Streamlit app by running</p><pre><code>streamlit run streamlit_app.py</code></pre><p>This assumes you named your Python file <em>streamlit_app.py</em>.</p><p>Great work! If you followed the project successfully so far, you should see something like this:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!UTYv!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b3f0b64-f04b-45ef-bbeb-140dadc30313_1728x1738.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!UTYv!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b3f0b64-f04b-45ef-bbeb-140dadc30313_1728x1738.png 424w, https://substackcdn.com/image/fetch/$s_!UTYv!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b3f0b64-f04b-45ef-bbeb-140dadc30313_1728x1738.png 848w, 
https://substackcdn.com/image/fetch/$s_!UTYv!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b3f0b64-f04b-45ef-bbeb-140dadc30313_1728x1738.png 1272w, https://substackcdn.com/image/fetch/$s_!UTYv!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b3f0b64-f04b-45ef-bbeb-140dadc30313_1728x1738.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!UTYv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b3f0b64-f04b-45ef-bbeb-140dadc30313_1728x1738.png" width="1456" height="1464" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8b3f0b64-f04b-45ef-bbeb-140dadc30313_1728x1738.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1464,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:909654,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!UTYv!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b3f0b64-f04b-45ef-bbeb-140dadc30313_1728x1738.png 424w, https://substackcdn.com/image/fetch/$s_!UTYv!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b3f0b64-f04b-45ef-bbeb-140dadc30313_1728x1738.png 848w, 
https://substackcdn.com/image/fetch/$s_!UTYv!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b3f0b64-f04b-45ef-bbeb-140dadc30313_1728x1738.png 1272w, https://substackcdn.com/image/fetch/$s_!UTYv!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8b3f0b64-f04b-45ef-bbeb-140dadc30313_1728x1738.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>As you can see, I already conversed with the bot, give it a try. 
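</p><p>The script above keeps its memory by pasting the whole history into the system prompt as one string. A common alternative is to send earlier turns as separate chat messages and cap how many are included. Below is a minimal sketch of that idea, not the approach used in this project: the helper <em>build_messages</em> and the <em>max_turns</em> cap are hypothetical names of mine, and dropping leading assistant turns assumes that the served chat template wants the conversation to start with a user message.</p><pre><code>def build_messages(history, system_prompt, max_turns=20):
    """Build a chat-completions message list from stored turns (hypothetical helper)."""
    # Keep only the most recent turns so the prompt does not grow without bound.
    start = max(0, len(history) - max_turns)
    recent = list(history[start:])
    # Assumption: the chat template expects the first non-system turn to come from
    # the user, so leading assistant turns (such as the greeting) are dropped.
    while recent and recent[0]["role"] != "user":
        recent.pop(0)
    messages = [{"role": "system", "content": system_prompt}]
    messages.extend({"role": m["role"], "content": m["content"]} for m in recent)
    return messages


if __name__ == "__main__":
    demo_history = [
        {"role": "assistant", "content": "Hello! How can I assist you today?"},
        {"role": "user", "content": "What is Kubernetes?"},
    ]
    for message in build_messages(demo_history, "You are a helpful chatbot."):
        print(message["role"], ":", message["content"])</code></pre><p>In the Streamlit script this could stand in for the hand-built list, e.g. <em>messages=build_messages(st.session_state.messages, "You are a chatbot built on top of Mistral 7B.")</em>, because the latest user prompt has already been appended to the session state at that point.</p><p>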
Pretty cool, we now have a chat interface to communicate with our Mistral model and it even has a very simple memory implementation.</p><p></p><h4>Step 4: Package the Streamlit application as a Docker container and expose it from within the Kubernetes cluster. </h4><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pbbl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe8138b6-cbe4-44f8-8bac-57c67e84d064_1926x1321.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!pbbl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe8138b6-cbe4-44f8-8bac-57c67e84d064_1926x1321.png 424w, https://substackcdn.com/image/fetch/$s_!pbbl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe8138b6-cbe4-44f8-8bac-57c67e84d064_1926x1321.png 848w, https://substackcdn.com/image/fetch/$s_!pbbl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe8138b6-cbe4-44f8-8bac-57c67e84d064_1926x1321.png 1272w, https://substackcdn.com/image/fetch/$s_!pbbl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe8138b6-cbe4-44f8-8bac-57c67e84d064_1926x1321.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!pbbl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe8138b6-cbe4-44f8-8bac-57c67e84d064_1926x1321.png" width="556" height="381.4862637362637" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/fe8138b6-cbe4-44f8-8bac-57c67e84d064_1926x1321.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/96f27015-b93a-410a-a008-fe9f62e31e42_1926x1321.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:999,&quot;width&quot;:1456,&quot;resizeWidth&quot;:556,&quot;bytes&quot;:196499,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!pbbl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe8138b6-cbe4-44f8-8bac-57c67e84d064_1926x1321.png 424w, https://substackcdn.com/image/fetch/$s_!pbbl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe8138b6-cbe4-44f8-8bac-57c67e84d064_1926x1321.png 848w, https://substackcdn.com/image/fetch/$s_!pbbl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe8138b6-cbe4-44f8-8bac-57c67e84d064_1926x1321.png 1272w, https://substackcdn.com/image/fetch/$s_!pbbl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffe8138b6-cbe4-44f8-8bac-57c67e84d064_1926x1321.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" 
stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Step 4 goal.</figcaption></figure></div><p>Nice progress! We don&#8217;t want to run the app locally each time though, so we will package it and deploy it to the same cluster where the Mistral model is deployed.</p><p>The app will be deployed inside the cluster, so we will reuse the internal service that we created in <em>Step 2</em> of the project. For this to work, we need to change the URL used in our Streamlit application. Create a new folder with the following files in it:</p><p><em>requirements.txt</em></p><pre><code>streamlit
openai</code></pre><p><em>streamlit_app.py</em> </p><pre><code>from openai import OpenAI
import streamlit as st

st.title("Our test Chatbot")

client = OpenAI(
    base_url="http://mistral-7b.default.svc.cluster.local/v1",
    api_key="abc"
)

if "messages" not in st.session_state:
    st.session_state["messages"] = [{"role": "assistant", "content": "Hello! How can I assist you today?"}]

for message in st.session_state.messages:
    with st.chat_message(message["role"]):
        st.markdown(message["content"])

if prompt := st.chat_input("Hello! How can I assist you today?"):
    st.session_state.messages.append({"role": "user", "content": prompt})
    with st.chat_message("user"):
        st.markdown(prompt)

    with st.chat_message("assistant"):
        message_history = ""
        for m in st.session_state.messages:
            message_history += f"role: {m['role']}, content: {m['content']}\n"
        stream = client.chat.completions.create(
            model="mistralai/Mistral-7B-Instruct-v0.3",
            messages=[
                {"role": "system", "content": f"You are a chatbot built on top of Mistral 7B Large Language Model, here is our message history: \n {message_history}"},
                {"role": "user", "content": prompt}
            ],
            stream=True,
            max_tokens=500
        )
        response = st.write_stream(stream)
    st.session_state.messages.append({"role": "assistant", "content": response})</code></pre><ul><li><p>Take note of the change on line 7: we are now pointing to the internal K8s service.</p></li></ul><p><em>Dockerfile</em></p><pre><code><code>FROM python:3.9

WORKDIR /code/src

COPY ./requirements.txt /code/requirements.txt

RUN pip install --no-cache-dir --upgrade -r /code/requirements.txt

COPY ./ /code/src

EXPOSE 8501

CMD ["streamlit", "run", "streamlit_app.py", "--server.port", "8501", "--server.address", "0.0.0.0"]</code></code></pre><p>Next, we will package the application as a Docker image and use Docker Hub to store and retrieve it.</p><p>Log into your Docker Hub account and create a new repository. I called mine <em>chatbot-demo</em>. The full tag of the Docker image that I used was <em>aurimasg/chatbot-demo:0.1</em>, which follows the pattern <em>&lt;docker_hub_account&gt;/&lt;repository&gt;:&lt;tag&gt;</em>. Wherever I use this notation, you will need to replace it with your own values.</p><p>Assuming that you have Docker properly configured and authenticated, run:</p><pre><code>docker build . --platform linux/amd64 -t &lt;docker_hub_account&gt;/&lt;repository&gt;:&lt;tag&gt;</code></pre><p>Once the build finishes, run:</p><pre><code>docker push &lt;docker_hub_account&gt;/&lt;repository&gt;:&lt;tag&gt;</code></pre><p>Great, you now have a publicly available Docker image of your Streamlit application, and it can be deployed on the Kubernetes cluster in Nebius cloud.</p><p>Let&#8217;s create the deployment and the service to expose it. Create the following files:</p><p><em>chatbot-deployment.yaml</em></p><pre><code>apiVersion: apps/v1
kind: Deployment
metadata:
  name: chatbot
  labels:
    app: chatbot
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: chatbot
  template:
    metadata:
      labels:
        app: chatbot
    spec:
      containers:
      - name: chatbot
        image: aurimasg/chatbot-demo:0.5  # replace with your own &lt;docker_hub_account&gt;/&lt;repository&gt;:&lt;tag&gt;
        imagePullPolicy: Always
        ports:
        - containerPort: 8501</code></pre><p><em>chatbot-service.yaml</em></p><pre><code>apiVersion: v1
kind: Service
metadata:
  name: chatbot-lb
  namespace: default
spec:
  ports:
  - name: chatbot
    port: 80
    protocol: TCP
    targetPort: 8501
  selector:
    app: chatbot
  sessionAffinity: None
  type: LoadBalancer</code></pre><p><strong>[IMPORTANT]: </strong>You will only have a quota of one public IP address for LoadBalancer services, so you will need to delete the LoadBalancer service created in Step 2 of the project before creating the new one. You can do that by running</p><pre><code>kubectl delete service mistral-7b-lb</code></pre><p>Note that after doing this you will lose public access to the model endpoint and your local Streamlit application will stop working.</p><p>Having said that, let&#8217;s now apply the deployment and service manifests. Run:</p><pre><code>kubectl apply -f chatbot-deployment.yaml</code></pre><pre><code>kubectl apply -f chatbot-service.yaml</code></pre><p>Check the deployment and service:</p><pre><code>kubectl get services</code></pre><div class="captioned-image-container"><figure><a class="image-link image2" target="_blank" href="https://substackcdn.com/image/fetch/$s_!vdGK!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e6b3a5b-a0cc-423c-842d-c0c38d22b7a6_677x141.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!vdGK!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e6b3a5b-a0cc-423c-842d-c0c38d22b7a6_677x141.png 424w, https://substackcdn.com/image/fetch/$s_!vdGK!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e6b3a5b-a0cc-423c-842d-c0c38d22b7a6_677x141.png 848w, https://substackcdn.com/image/fetch/$s_!vdGK!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e6b3a5b-a0cc-423c-842d-c0c38d22b7a6_677x141.png 1272w, 
https://substackcdn.com/image/fetch/$s_!vdGK!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e6b3a5b-a0cc-423c-842d-c0c38d22b7a6_677x141.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!vdGK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e6b3a5b-a0cc-423c-842d-c0c38d22b7a6_677x141.png" width="728" height="151.6218611521418" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/9e6b3a5b-a0cc-423c-842d-c0c38d22b7a6_677x141.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:141,&quot;width&quot;:677,&quot;resizeWidth&quot;:728,&quot;bytes&quot;:25576,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!vdGK!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e6b3a5b-a0cc-423c-842d-c0c38d22b7a6_677x141.png 424w, https://substackcdn.com/image/fetch/$s_!vdGK!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e6b3a5b-a0cc-423c-842d-c0c38d22b7a6_677x141.png 848w, https://substackcdn.com/image/fetch/$s_!vdGK!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e6b3a5b-a0cc-423c-842d-c0c38d22b7a6_677x141.png 1272w, 
https://substackcdn.com/image/fetch/$s_!vdGK!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F9e6b3a5b-a0cc-423c-842d-c0c38d22b7a6_677x141.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>If all went according to plan, you should see the <em>External-IP</em> successfully mounted to the service. Go on and try it out, enter it in your browser - in my case http://195.242.10.255.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!niBg!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa44143d5-3358-42b0-98a3-616edb47e43f_2038x1858.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!niBg!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa44143d5-3358-42b0-98a3-616edb47e43f_2038x1858.png 424w, https://substackcdn.com/image/fetch/$s_!niBg!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa44143d5-3358-42b0-98a3-616edb47e43f_2038x1858.png 848w, https://substackcdn.com/image/fetch/$s_!niBg!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa44143d5-3358-42b0-98a3-616edb47e43f_2038x1858.png 1272w, https://substackcdn.com/image/fetch/$s_!niBg!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa44143d5-3358-42b0-98a3-616edb47e43f_2038x1858.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!niBg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa44143d5-3358-42b0-98a3-616edb47e43f_2038x1858.png" width="1456" height="1327" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a44143d5-3358-42b0-98a3-616edb47e43f_2038x1858.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1327,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:225751,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!niBg!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa44143d5-3358-42b0-98a3-616edb47e43f_2038x1858.png 424w, https://substackcdn.com/image/fetch/$s_!niBg!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa44143d5-3358-42b0-98a3-616edb47e43f_2038x1858.png 848w, https://substackcdn.com/image/fetch/$s_!niBg!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa44143d5-3358-42b0-98a3-616edb47e43f_2038x1858.png 1272w, https://substackcdn.com/image/fetch/$s_!niBg!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa44143d5-3358-42b0-98a3-616edb47e43f_2038x1858.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" 
class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>And this concludes our project. We learned how to:</p><ul><li><p>Spin up and configure a Kubernetes cluster with GPU support on Nebius Cloud.</p></li><li><p>Deploy a vLLM server exposing an open source LLM API (using GPUs).</p></li><li><p>Expose the service outside of the cluster.</p></li><li><p>Build a simple chatbot using Streamlit to utilize this LLM endpoint.</p></li><li><p>Package and deploy the chatbot application as a deployment in Kubernetes.</p></li><li><p>Expose it via a public IP.</p><p></p></li></ul><h4>Disclaimer.</h4><p>This is just a simple demo project. If you want to make it production-ready, there are many things to consider:</p><ul><li><p>High availability of the Kubernetes cluster.</p></li><li><p>Monitoring of your 
application.</p></li><li><p>Building Docker images in a secure way.</p></li><li><p>Securing the publicly exposed endpoint.</p></li><li><p>Rate limiting of your application.</p></li><li><p>Horizontal scalability of both the LLM API and the Streamlit application.</p></li><li><p>&#8230;</p></li></ul><p>Let me know if you had any issues while following the project! Hope to see you in the next article :)</p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/p/ai-clouds-and-their-role-in-the-ai?utm_source=substack&utm_medium=email&utm_content=share&action=share&quot;,&quot;text&quot;:&quot;Share&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/p/ai-clouds-and-their-role-in-the-ai?utm_source=substack&utm_medium=email&utm_content=share&action=share"><span>Share</span></a></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/p/sponsorships&quot;,&quot;text&quot;:&quot;Partner with SwirlAI&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/p/sponsorships"><span>Partner with SwirlAI</span></a></p>]]></content:encoded></item><item><title><![CDATA[What is AI Engineering?]]></title><description><![CDATA[And what you need to break into the role.]]></description><link>https://www.newsletter.swirlai.com/p/what-is-ai-engineering</link><guid 
isPermaLink="false">https://www.newsletter.swirlai.com/p/what-is-ai-engineering</guid><dc:creator><![CDATA[Aurimas Griciūnas]]></dc:creator><pubDate>Sat, 30 Nov 2024 10:36:00 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/df15f15b-3871-489b-83f4-7c8c38df9f6f_2709x2402.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>&#128075; <em>I am <a href="https://www.linkedin.com/in/aurimas-griciunas">Aurimas</a>. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.</em></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">SwirlAI Newsletter is a reader-supported publication. To receive new posts and support my work, consider becoming a subscriber.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><div><hr></div><p>Recently there has been a lot of buzz around AI Engineering - AGAIN :) . I was a bit surprised as I had an intuition that throughout the last 2 years, after the hype of LLMs, the definition of the role would have settled by now. Also, working in ML infrastructure space, the discussions around AI Engineer ICP were often happening and it seemed clear what kind of profile we were talking about. 
I guess I was living in a bubble! </p><p>In this article I will outline my thoughts on the role of the AI Engineer and how it has evolved over the past year. My goal is to make part of SwirlAI a one-stop shop for anyone who wants to break into the role or upskill as an AI Engineer. Let&#8217;s go!</p><p>A short outline of the article:</p><ul><li><p>Evolution of AI Systems in the age of LLMs.</p></li><li><p>How is AI Engineering different from Machine Learning or Software Engineering?</p></li><li><p>What skills would you need to break into the role?</p></li><li><p>What is the future of AI Engineering?</p></li></ul><p></p><h3>Evolution of AI Systems in the age of LLMs.</h3><p>My take is simple (and it might be perceived as controversial by some). AI Systems did not change that much. What has changed is that we now have LLMs that allow us to solve some additional complex tasks - as steps in the AI Systems pipeline - that we previously could not. In general, an LLM is an extremely versatile tool in your day-to-day, but what are the new key capabilities when it comes to building AI systems?</p><ul><li><p>Planning.</p></li><li><p>Content extraction.</p></li><li><p>Content generation.</p></li><li><p>Code generation.</p></li></ul><p>That is pretty much it. Is it powerful? Hell yes! Especially when combined with regular software and machine learning. </p><p>An AI System remains, well, a system of multiple components: some of them can be LLMs, some of them - code executions, some of them - just a good old classification model. 
Having said that, it is worth exploring how these new types of applications that are the synthesis of old and new have evolved, as AI Engineers are closely tied to their development and deployment.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!3jWM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faefd6ef5-2550-42ff-b6ad-b691d48123a9_2926x739.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!3jWM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faefd6ef5-2550-42ff-b6ad-b691d48123a9_2926x739.png 424w, https://substackcdn.com/image/fetch/$s_!3jWM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faefd6ef5-2550-42ff-b6ad-b691d48123a9_2926x739.png 848w, https://substackcdn.com/image/fetch/$s_!3jWM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faefd6ef5-2550-42ff-b6ad-b691d48123a9_2926x739.png 1272w, https://substackcdn.com/image/fetch/$s_!3jWM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faefd6ef5-2550-42ff-b6ad-b691d48123a9_2926x739.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!3jWM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faefd6ef5-2550-42ff-b6ad-b691d48123a9_2926x739.png" width="1456" height="368" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/aefd6ef5-2550-42ff-b6ad-b691d48123a9_2926x739.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/b0a00122-0564-4b25-b8e6-e3e106849a6f_2926x739.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:368,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:150895,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!3jWM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faefd6ef5-2550-42ff-b6ad-b691d48123a9_2926x739.png 424w, https://substackcdn.com/image/fetch/$s_!3jWM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faefd6ef5-2550-42ff-b6ad-b691d48123a9_2926x739.png 848w, https://substackcdn.com/image/fetch/$s_!3jWM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faefd6ef5-2550-42ff-b6ad-b691d48123a9_2926x739.png 1272w, https://substackcdn.com/image/fetch/$s_!3jWM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Faefd6ef5-2550-42ff-b6ad-b691d48123a9_2926x739.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" 
stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Evolution of LLM based AI Systems</figcaption></figure></div><p>If it were up to me, I would rather use the term AI Systems Engineer than AI Engineer, as it would ground the term in what building a successful AI product is actually about. There are two main engineering-related roles involved in building the end result, a productionised AI System:</p><ul><li><p><strong>AI Researchers</strong> - they work in the initial stage (marked as <em>1. </em>in the picture above) of the LLM application lifecycle, either pre-training or post-training the LLMs that will later be used in AI systems. In recent months more and more emphasis has been put on the post-training phase, as it seems we are finally reaching the limits of the Scaling Law for LLMs. Out of the research in post-training we got products like OpenAI o1. 
Its usefulness is still limited, but with a reduction in latency it could significantly enhance the systems that are built on top of such &#8220;reasoning&#8221; models.</p></li><li><p><strong>AI Engineers</strong> - they are building AI Systems (marked as <em>2. </em>in the picture above) leveraging pre-trained LLMs to solve real business problems. In past years the practice has evolved from building simple applications based on a single <em>prompt &#8594; answer</em> architecture, to more complex Retrieval Augmented Generation systems, then to Agentic RAG, and eventually to Agents that are capable of more than just answering questions. In the next two years we should be able to reliably deploy more complex multi-agent systems and maybe (more likely at the end of 2026 or in 2027) fully autonomous agentic systems that would require little guard-railing from human operators.</p></li></ul><p>There is some overlap between what AI Researchers and AI Engineers would work on in their day-to-day if we consider the common definition of roles (<em>point 3.</em>). The field is evolving fast. As mentioned before, more and more research is being poured into the post-training process. It is yet to be seen how much involvement AI Engineers will have in that, but it seems that it might move more to the side of AI Researchers. The goal of an AI Engineer is to take what is already available and stitch up an AI system that would solve the business problem. That does not mean that you don&#8217;t need to fine-tune an LLM from time to time. 
</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!OoYq!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdd2fbbc-dadd-4b7b-8ae0-c629cc32cd84_2625x1721.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!OoYq!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdd2fbbc-dadd-4b7b-8ae0-c629cc32cd84_2625x1721.png 424w, https://substackcdn.com/image/fetch/$s_!OoYq!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdd2fbbc-dadd-4b7b-8ae0-c629cc32cd84_2625x1721.png 848w, https://substackcdn.com/image/fetch/$s_!OoYq!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdd2fbbc-dadd-4b7b-8ae0-c629cc32cd84_2625x1721.png 1272w, https://substackcdn.com/image/fetch/$s_!OoYq!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdd2fbbc-dadd-4b7b-8ae0-c629cc32cd84_2625x1721.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!OoYq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdd2fbbc-dadd-4b7b-8ae0-c629cc32cd84_2625x1721.png" width="1456" height="955" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bdd2fbbc-dadd-4b7b-8ae0-c629cc32cd84_2625x1721.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c26ad454-c809-4c6f-b6c4-1c3023fa7169_2625x1721.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:955,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:273913,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!OoYq!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdd2fbbc-dadd-4b7b-8ae0-c629cc32cd84_2625x1721.png 424w, https://substackcdn.com/image/fetch/$s_!OoYq!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdd2fbbc-dadd-4b7b-8ae0-c629cc32cd84_2625x1721.png 848w, https://substackcdn.com/image/fetch/$s_!OoYq!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdd2fbbc-dadd-4b7b-8ae0-c629cc32cd84_2625x1721.png 1272w, https://substackcdn.com/image/fetch/$s_!OoYq!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbdd2fbbc-dadd-4b7b-8ae0-c629cc32cd84_2625x1721.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" 
stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Agentic Flows and Evals</figcaption></figure></div><p>Nowadays, a real production-ready AI System is rarely simple. Most projects fail because the accuracy of outputs cannot easily be increased to the required level (or the system is deployed without knowing its accuracy, which always leads to eventual silent failure and disastrous outcomes). In order to improve the system we bring some level of agency to it via various methods like:</p><ul><li><p>Routing.</p></li><li><p>Reflection.</p></li><li><p>Query rewrites.</p></li><li><p>&#8230;</p></li></ul><p></p><blockquote><p>&#8220;The goal of an AI Engineer is to take what is already available and stitch up an AI system that would solve a real business problem.&#8221;</p></blockquote><p></p><p>As the system becomes more complex, most of the nodes in the pipeline produce non-deterministic results. 
These need to be evaluated because any failure at any step will most likely derail the final output. Evals are HARD and not yet a universally solved problem, though they are being aggressively researched. In order to bring robustness to the system we add components like:</p><ul><li><p>Evaluations.</p></li><li><p>Observability.</p></li><li><p>Guardrails.</p></li><li><p>&#8230;</p></li></ul><p></p><h4>A simple example of agency in AI Systems.</h4><p>A very simple example of an Agentic system is a basic Agentic RAG.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gp1d!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b64ee1-417f-4e1b-b33a-3ab4bb46b8e6_2019x1609.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gp1d!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b64ee1-417f-4e1b-b33a-3ab4bb46b8e6_2019x1609.png 424w, https://substackcdn.com/image/fetch/$s_!gp1d!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b64ee1-417f-4e1b-b33a-3ab4bb46b8e6_2019x1609.png 848w, https://substackcdn.com/image/fetch/$s_!gp1d!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b64ee1-417f-4e1b-b33a-3ab4bb46b8e6_2019x1609.png 1272w, https://substackcdn.com/image/fetch/$s_!gp1d!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b64ee1-417f-4e1b-b33a-3ab4bb46b8e6_2019x1609.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!gp1d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b64ee1-417f-4e1b-b33a-3ab4bb46b8e6_2019x1609.png" width="1456" height="1160" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e5b64ee1-417f-4e1b-b33a-3ab4bb46b8e6_2019x1609.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/0912fc27-3a96-47f4-828e-ecb01304f8ae_2019x1609.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1160,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:282278,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!gp1d!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b64ee1-417f-4e1b-b33a-3ab4bb46b8e6_2019x1609.png 424w, https://substackcdn.com/image/fetch/$s_!gp1d!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b64ee1-417f-4e1b-b33a-3ab4bb46b8e6_2019x1609.png 848w, https://substackcdn.com/image/fetch/$s_!gp1d!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b64ee1-417f-4e1b-b33a-3ab4bb46b8e6_2019x1609.png 1272w, https://substackcdn.com/image/fetch/$s_!gp1d!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe5b64ee1-417f-4e1b-b33a-3ab4bb46b8e6_2019x1609.png 1456w" sizes="100vw" loading="lazy"></picture><div 
class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Agentic RAG</figcaption></figure></div><p>These are the steps that describe such a system at a high level:</p><ol><li><p>Analysis of the user query: we pass the original user query to an LLM-based Agent for analysis. This is where:</p><ol><li><p>The original query can be rewritten, sometimes multiple times, to create either a single or multiple queries to be passed down the pipeline.</p></li><li><p>The agent decides if additional data sources are required to answer the query.</p></li></ol></li><li><p>If additional data is required, the Retrieval step is triggered. 
In the Agentic RAG case, we could have one or more agents responsible for figuring out which data sources should be tapped into. A few examples:</p><ol><li><p>Real time user data. This is a pretty cool concept as we might have some real time information like current location available for the user.</p></li><li><p>Internal documents that a user might be interested in.</p></li><li><p>Data available on the web.</p></li><li><p>&#8230;</p></li></ol></li><li><p>If there is no need for additional data, we try to compose the answer (or multiple answers) straight via an LLM.</p></li><li><p>The answer (or answers) gets analysed, summarised and evaluated for correctness and relevance:</p><ol><li><p>If the Agent decides that the answer is good enough, it gets returned to the user.</p></li><li><p>If the Agent decides that the answer needs improvement, we try to rewrite the user query and repeat the generation loop.</p></li></ol></li></ol><p>It is clear that there is a lot of non-determinism added to the regular RAG pipeline, including some cases where the pipeline could go into an infinite reasoning loop if not properly interrupted. We will go step by step into the process of evolving your RAG pipelines in future Newsletter episodes.</p><p>Agentic RAG can already be considered an Agent and it is one of the most common ones in the industry. This is what AI Engineers are dealing with!</p><p></p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://www.newsletter.swirlai.com/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://www.newsletter.swirlai.com/subscribe?"><span>Subscribe now</span></a></p><p></p><h3>AI Engineering vs. ML and Software Engineering.</h3><p>Let&#8217;s get this straight: it might seem easy to build an LLM application nowadays. 
Just connect to a third party API, craft some prompts, connect the app to a chat interface or your e-mail stream and you have yourself an agentic system. And it is true - it is easy to do exactly that, but the system is only good until it starts breaking apart.</p><p>My observation while talking with companies building with LLMs is that most of them are in early stages where Software Engineers have already built initial applications (usually chat bots, chat bot agents or email agents). Unfortunately, Software Engineers are not used to dealing with non-deterministic systems; that is what ML Engineers do. These systems usually lack proper observability or even some sort of basic automated evaluation - organisations are running blind and are relying on human evaluations for products at scale.</p><p>Just a year and a half ago I would have said that the existence of this new role called AI Engineer was not justified. Back then people were openly discussing how everyone could now build LLM-based apps and ship them to production in days. My opinion changed after building several such apps. 
It is easy to start, but without proper foundation, the projects will fail - we need AI Engineers.</p><p>One could argue that the transition to AI Engineering would be most natural from either ML Engineer, Software Engineer or AI Researcher.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!-Igl!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d939e5a-d2ab-421c-a113-ea7d07a5d667_1969x1721.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!-Igl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d939e5a-d2ab-421c-a113-ea7d07a5d667_1969x1721.png 424w, https://substackcdn.com/image/fetch/$s_!-Igl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d939e5a-d2ab-421c-a113-ea7d07a5d667_1969x1721.png 848w, https://substackcdn.com/image/fetch/$s_!-Igl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d939e5a-d2ab-421c-a113-ea7d07a5d667_1969x1721.png 1272w, https://substackcdn.com/image/fetch/$s_!-Igl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d939e5a-d2ab-421c-a113-ea7d07a5d667_1969x1721.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!-Igl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d939e5a-d2ab-421c-a113-ea7d07a5d667_1969x1721.png" width="1456" height="1273" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/8d939e5a-d2ab-421c-a113-ea7d07a5d667_1969x1721.png&quot;,&quot;srcNoWatermark&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/150a810e-e291-4eed-83e0-016e5529611d_1969x1721.png&quot;,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1273,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:346939,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!-Igl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d939e5a-d2ab-421c-a113-ea7d07a5d667_1969x1721.png 424w, https://substackcdn.com/image/fetch/$s_!-Igl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d939e5a-d2ab-421c-a113-ea7d07a5d667_1969x1721.png 848w, https://substackcdn.com/image/fetch/$s_!-Igl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d939e5a-d2ab-421c-a113-ea7d07a5d667_1969x1721.png 1272w, https://substackcdn.com/image/fetch/$s_!-Igl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F8d939e5a-d2ab-421c-a113-ea7d07a5d667_1969x1721.png 1456w" sizes="100vw" loading="lazy"></picture></div></a><figcaption class="image-caption">The AI Engineer</figcaption></figure></div><p><strong>AI Researchers - </strong>they are masters of prototyping, coming up with novel ideas and testing their hypotheses. They analyse the output data and come up with new strategies for continuously improving the models. They have a deep understanding of statistics and ML fundamentals. Nowadays, they are very likely able to run LLM training on distributed systems themselves.</p><ul><li><p>What they might initially lack is the ability to deploy real-world production applications and implement MLOps best practices in the world of LLMs.</p></li></ul><p><strong>ML Engineers - </strong>capable of building and deploying regular Machine Learning models as AI/ML systems with all of the bells and whistles of MLOps. 
This includes implementing the feedback flywheel and the ability to observe and continuously improve the system. Also, ML Engineers are usually involved in Data Engineering to some extent, often utilising ML-specific data stores like Feature Stores or Vector DBs.</p><ul><li><p>What they might initially lack is the ability to perform deep research and build production-ready, high-throughput systems, as well as to implement and operate regular software best practices.</p></li></ul><p><strong>Software Engineers</strong> - they are great! Capable of crafting complex high-throughput, low-latency systems that are deterministic. Translating business requirements into complex software flows. Masters of DevOps and software engineering best practices, capable of high-velocity development and shipping to production in a safe way.</p><ul><li><p>What they might initially lack is the ability to reason about non-deterministic systems and the knowledge of how to observe and evaluate them. Also, it is not in their nature to continuously learn non-software topics that could completely shift in a day, requiring re-architecture of the entire system.</p></li></ul><p>It is naive to expect that there would be many professionals out there who are great at all 3 disciplines - that's a unicorn. That is why I think we will usually see AI Engineers possessing a blend of 2 disciplines and falling into one of the areas <em>1., 2. or 3. </em>as depicted in the diagram above. Will we have separate names for these roles? Who knows. The first thing that comes to my mind is that maybe the person in <em>3. </em>could be called an AI Systems Engineer or AI Architect. Let's see how it evolves!</p><p>Now, there is a trend of new companies that start with far fewer engineers outcompeting the incumbents. 
I believe that full-stack AI Engineers will be the ones best positioned to disrupt the market.</p><p></p><h3>What skills would you need to succeed in AI Engineering?</h3><ul><li><p><strong>Research</strong> - white papers need to become your best friend. There is so much research happening in the field of Agentic applications that it is hard to keep up. As an example, just recently, there has been a <a href="https://arxiv.org/abs/2411.10541">paper</a> released on how Prompt Formatting can influence the performance of your LLM applications. The truth is that with internal data and compute resources at your disposal, you - the AI Engineer - are best positioned to do your own research on what works and what does not, and you should do it for the sake of your employer.</p></li><li><p><strong>Prompt Engineering</strong> - while it might sound simple, the range of techniques for prompt engineering and formatting is vast. When it comes to agentic systems, you are also dealing with cross-agent prompt dependencies, shared state and memory that is also implemented via prompting. On top of this, everything needs to be evaluated, so you will need custom evals for every prompt you craft, coupled with datasets that you can test on.</p></li><li><p><strong>Software Development - </strong>no questions here, the systems you are deploying need to be solid. You need to know and follow software engineering and DevOps best practices.</p></li><li><p><strong>Infrastructure - </strong>one aspect of this is that you need to be able to deploy your own work; you could say it is part of Software Development. You also need to understand your data and new types of storage systems like Vector DBs. 
In general, these are not new, but they are rarely used by non-ML Engineers.</p></li><li><p><strong>Data Engineering</strong> - you would be surprised by how much time you actually spend understanding, cleaning and processing the data that is then used in your AI Systems. Not everything is about prompting; the hardest part is usually integrating the data sources into your AI applications.</p></li><li><p><strong>MLOps adapted for AI Systems (AgentOps) </strong>- we have introduced a lot of good practices into building AI systems in the past ~5 years via the MLOps movement. Most of them should be carried over when building with LLMs.</p><ul><li><p>Evaluation.</p></li><li><p>Observability. I talk about some of the challenges in observing Agentic systems in one of my articles: </p><div class="digest-post-embed" data-attrs="{&quot;nodeId&quot;:&quot;876ab9d8-10ee-49e0-a311-d181bfb78c87&quot;,&quot;caption&quot;:&quot;&#128075; I am Aurimas. I write the SwirlAI Newsletter with the goal of presenting complicated Data related concepts in a simple and easy-to-digest way. My mission is to help You UpSkill and keep You updated on the latest news in GenAI, MLOps, Data Engineering, Machine Learning and overall Data space.&quot;,&quot;cta&quot;:null,&quot;showBylines&quot;:true,&quot;size&quot;:&quot;sm&quot;,&quot;isEditorNode&quot;:true,&quot;title&quot;:&quot;Observability in LLMOps pipeline - Different Levels of Scale&quot;,&quot;publishedBylines&quot;:[{&quot;id&quot;:14122259,&quot;name&quot;:&quot;Aurimas Grici&#363;nas&quot;,&quot;bio&quot;:&quot;I have over a decade of work experience in various data related fields: Data Analytics, Data Science, Machine Learning, Data Engineering, Cloud Engineering. 
For three years I have led teams working with Data and Infrastructure.&quot;,&quot;photo_url&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/746f0396-fc7f-4690-b75c-ef482a8cb1c7_3684x3683.jpeg&quot;,&quot;is_guest&quot;:false,&quot;bestseller_tier&quot;:null}],&quot;post_date&quot;:&quot;2024-10-21T08:45:41.936Z&quot;,&quot;cover_image&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/faed7241-c0f0-492d-b546-56cd44d7319e_2926x2192.png&quot;,&quot;cover_image_alt&quot;:null,&quot;canonical_url&quot;:&quot;https://www.newsletter.swirlai.com/p/observability-in-llmops-pipeline&quot;,&quot;section_name&quot;:null,&quot;video_upload_id&quot;:null,&quot;id&quot;:150001948,&quot;type&quot;:&quot;newsletter&quot;,&quot;reaction_count&quot;:37,&quot;comment_count&quot;:0,&quot;publication_id&quot;:null,&quot;publication_name&quot;:&quot;SwirlAI Newsletter&quot;,&quot;publication_logo_url&quot;:&quot;https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F27ed734e-48b5-446d-a93d-5a54178a0e34_1024x1024.png&quot;,&quot;belowTheFold&quot;:true,&quot;youtube_url&quot;:null,&quot;show_links&quot;:null,&quot;feed_url&quot;:null}"></div></li><li><p>Prompt tracking and versioning.</p></li><li><p>Feedback and the continuous system improvement flywheel.</p></li></ul></li></ul><p></p><blockquote><p>&#8220;Not everything is about prompting; the hardest part is usually integrating the internal data sources into your AI applications.&#8221;</p></blockquote><p></p><h4>What would your day-to-day look like as an AI Engineer?</h4><p>Surprisingly, there is a lot of non-engineering work in your day-to-day as an AI Engineer. Organisations are still on the hype train around LLMs, and you will be pushed to implement them everywhere, even where there is no good fit. 
One of your key responsibilities will be to coach the organisation and be at the forefront of deciding whether and where LLMs are even needed to solve a business problem.</p><p>You will spend a lot of time researching, reading papers and blogs, and following what the top players are doing. Remember when Anthropic introduced <a href="https://www.anthropic.com/news/contextual-retrieval">contextual embeddings</a>? It seems painfully obvious now that it is a good idea, but it took some time until someone wrote about it and made it feasible cost-wise via prompt caching.</p><p>You will spend a lot of time figuring out how to build your test datasets so that you can evaluate your systems. Collaboration with your stakeholders will be key, e.g. you will need tight integrations with your front-end systems to collect feedback.</p><p>Only then does the engineering part start! And all of the above is worth it :)</p><p></p><blockquote><p>&#8220;One of your key responsibilities will be to coach the organisation and be at the forefront of deciding whether and where LLMs are even needed to solve a business problem.&#8221;</p></blockquote><p></p><h3>What is the future of AI Engineering?</h3><p>My prediction is that every company will have a set of agentic flows automating their processes within the next few years. Hence, all companies will need the skills of AI Engineers. On top of that, AI applications will be the ones bringing the most value to the business and will be required to keep up with the competition, as everyone will be doubling down on this technology.</p><p>The AI Engineer is positioned to be the hottest role in the upcoming years. The salaries are high and the demand will keep increasing due to the shortage of talent in the field. 2025 will be the year of agents, 2026 - very likely the year of multi-agent systems and autonomous agents.</p><p>On top of this, AI Engineers are and will remain best positioned to take on the task of building new companies with minimal resources. 
The more full-stack an AI Engineer is, the more power she will have at her fingertips.</p><p>If you want to break into the role or level up as an AI Engineer, be sure to subscribe to the Newsletter. Let's go build!</p>]]></content:encoded></item></channel></rss>