23 Comments
Karolina Griciune's avatar

Liked it a lot! Very informative and down-to-earth.

Aurimas Griciūnas's avatar

Hope it will help to put many on the right up-skilling path. Glad you liked it!

Paul Iusztin's avatar

Amazing roadmap, man! So eager to see how the world of GraphRAG evolves in the following years. Semantic search has soooo many limitations GraphRAG can solve.

Aurimas Griciūnas's avatar

Thanks and looking forward to that as well. I believe GraphRAG will be widely adopted in Long-Term Memory for Agents, but mostly Episodic and not Semantic.

Paul Iusztin's avatar

Why not semantic? If you need an ontology of the entities within your system, in my opinion you have no other option than GraphRAG. For example, to understand that "X is related to Y".

But maybe the name of "semantic memory" is confusing 😂

Aurimas Griciūnas's avatar

Because preprocessing (relation extraction) of such data is and will be too costly. Edge cases where it could be adopted:

- Small data sets.

- Structured data.

- Relations are already available.

That's just my observation though. Since LLMs are often used to extract the relationships, it is simply cost-prohibitive. Who knows, as LLM costs continue to drop we might see wider adoption.

The potential is big for sure.

Ibraheem's avatar

Very informative article 💯

Aurimas Griciūnas's avatar

Thank you, glad you found it useful!

Michael's avatar

Nice and practical guide, I like it. From the perspective of math, is my impression correct that it is needed mostly at the start to learn the basics, and the rest is more of the “engineering” part? If I want to learn the math behind LLMs, is knowledge of common statistical models enough to go straight to neural networks, or do I need to learn Random Forest, XGBoost and other ML modeling approaches first? What would be your opinion?

Aurimas Griciūnas's avatar

In general, the math behind neural networks is even simpler than the math behind algorithms like XGBoost. They all serve different use cases, and simpler models like Random Forest or XGBoost tend to perform better in most cases, as it is harder to properly train a neural network (and neural networks take more resources to train).

When it comes to LLMs, while you don't need the math in your day-to-day, it is useful to have at least statistics fundamentals, not because you will be training the models but for evaluation purposes.

Having said all of this, I would suggest continuing to learn fundamental math and statistics as you build your LLM applications, since it will let you think in more depth about the behaviour of non-deterministic systems, which LLM-based systems are.

George Bullock's avatar

This is great. You perfectly managed breadth with depth. I like the clear and opinionated explanations. I hope you will update this post regularly so I can stay up to date on this topic 🤞🏾.

Aurimas Griciūnas's avatar

Glad you like it! I was thinking of updating it regularly, but some other way to do it might be better as Newsletters tend to not be the best place for dynamic content :)

George Bullock's avatar

Agree there are better options than a newsletter. Have you considered a static site? I can imagine a two-page site with the main content page and an optional change log page. I would totally bookmark that site. You could use Substack for short update posts to let people know what’s changed and why.

Aurimas Griciūnas's avatar

I had similar thoughts, will have to revisit after the traveling that I will have to do in the next two months :)

Alex Razvant's avatar

Golden! 🔥

Aurimas Griciūnas's avatar

Thanks Alex, glad you like it!

Miguel Otero Pedrido's avatar

Great article!

Aurimas Griciūnas's avatar

Thank you Miguel :) Hope many will find it useful for breaking into AI Engineering.

Dung Tran's avatar

incredible roadmap! love that you made it easy to navigate to the relevant resources.

Naina Chaturvedi's avatar

++ Good post. Also, start here: Compilation of 100+ Most Asked System Design, ML System Design Case Studies and LLM System Design

https://open.substack.com/pub/naina0405/p/important-compilation-of-most-asked?r=14q3sp&utm_campaign=post&utm_medium=web&showWelcomeOnShare=false

Goutham's avatar

clean and to the point

Muhammad Ahasanul Hoque's avatar

Creative sequence for learning.