Liked it a lot! Very informative and down-to-earth.
Hope it will help to put many on the right up-skilling path. Glad you liked it!
Amazing roadmap, man! So eager to see how the world of GraphRAG evolves in the coming years. Semantic search has soooo many limitations GraphRAG can solve.
Thanks, and looking forward to that as well. I believe GraphRAG will be widely adopted for Long-Term Memory in Agents, but mostly Episodic and not Semantic memory.
Why not semantic? If you need an ontology of the entities in your system, in my opinion you have no option other than GraphRAG. For example, to understand that "X is related to Y".
But maybe the name of "semantic memory" is confusing 😂
Because preprocessing (relation extraction) of such data is, and will remain, too costly. Edge cases where it could be adopted:
- Small data sets.
- Structured data.
- Relations are already available.
That's just my observation, though. Since LLMs are typically used to extract the relationships, it is simply cost-prohibitive. Who knows, as LLM costs continue to drop we might see wider adoption.
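If it helps to picture where the cost comes from, here is a minimal, purely illustrative sketch of the extraction loop (the prompt, the `call_llm` placeholder, and the chunking are assumptions, not any specific GraphRAG library's API): every chunk needs at least one LLM call, and that per-chunk cost is what dominates on large unstructured corpora.

```python
import json
from typing import Callable

# Placeholder LLM call -- swap in whichever client and model you actually use.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

# Hypothetical extraction prompt; real GraphRAG pipelines use more elaborate ones.
EXTRACTION_PROMPT = """Extract (subject, relation, object) triples from the text below.
Return only a JSON list, e.g. [["X", "related_to", "Y"]].

Text:
{chunk}
"""

def extract_triples(chunks: list[str], llm: Callable[[str], str] = call_llm) -> list[tuple]:
    """One LLM call per chunk: this per-chunk cost is what makes graph
    construction expensive on large, unstructured corpora."""
    triples: list[tuple] = []
    for chunk in chunks:
        raw = llm(EXTRACTION_PROMPT.format(chunk=chunk))
        try:
            triples.extend(tuple(t) for t in json.loads(raw))
        except (json.JSONDecodeError, TypeError):
            continue  # skip malformed responses rather than failing the whole run
    return triples
```

On small or already-structured datasets (the edge cases above) this loop is cheap; at corpus scale the per-chunk calls add up quickly.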
The potential is big for sure.
Very informative article 💯
Thank you, glad you found it useful!
Nice and practical guide, I like it. From the math perspective, is my impression correct that it is needed mostly at the start to learn the basics, and the rest is more of the “engineering” part? If I want to learn the math behind LLMs, is knowledge of common statistical models enough to go straight to neural networks, or do I need to learn Random Forest, XGBoost, and other ML modeling approaches first? What would be your opinion?
In general, the math behind neural networks is simpler than that of algorithms like XGBoost. They all serve different use cases, and simpler models like Random Forest or XGBoost tend to perform better in many cases, as it is harder to train a neural network properly (and neural networks take more resources to train).
When it comes to LLMs, while you don't need the math in your day-to-day work, it is useful to have at least the statistics fundamentals, not because you will be training the models but for evaluation purposes.
Having said all of this, I would suggest continuously learning fundamental math and statistics as you build your LLM applications, as it will let you reason in more depth about the behaviour of non-deterministic systems, which LLM-based systems are.
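To make the evaluation point a bit more concrete, here is a small, assumed sketch (the numbers are made up, and the bootstrap helper is just one common way to do this): putting a confidence interval around an eval pass rate is exactly the kind of basic statistics that helps you judge whether a change in a non-deterministic system is signal or noise.

```python
import random

def bootstrap_ci(scores: list[int], n_resamples: int = 10_000, alpha: float = 0.05) -> tuple[float, float]:
    """Percentile bootstrap confidence interval for the mean of a binary
    eval score (1 = the LLM answered correctly, 0 = it did not)."""
    means = sorted(
        sum(random.choices(scores, k=len(scores))) / len(scores)  # resample with replacement
        for _ in range(n_resamples)
    )
    return means[int(alpha / 2 * n_resamples)], means[int((1 - alpha / 2) * n_resamples) - 1]

# Illustrative numbers only: 100 eval cases, 83 passed.
scores = [1] * 83 + [0] * 17
low, high = bootstrap_ci(scores)
print(f"pass rate ~0.83, 95% CI roughly ({low:.2f}, {high:.2f})")
```

The same fundamentals carry over when comparing two prompt or model variants: if their intervals overlap heavily, the "improvement" may just be noise.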
This is great. You perfectly balanced breadth with depth. I like the clear and opinionated explanations. I hope you will update this post regularly so I can stay up to date on this topic 🤞🏾.
Glad you like it! I was thinking of updating it regularly, but some other format might work better, as newsletters tend not to be the best place for dynamic content :)
Agree there are better options than a newsletter. Have you considered a static site? I can imagine a two-page site with the main content page and an optional changelog page. I would totally bookmark that site. You could use Substack for short update posts to let people know what’s changed and why.
I had similar thoughts; I'll have to revisit it after the traveling I have planned for the next two months :)
Golden! 🔥
Thanks Alex, glad you like it!
Great article!
Thank you Miguel :) Hope many will find it useful for breaking into AI Engineering.