Oct 29, 2022 · Liked by Aurimas Griciūnas

So good, so clear... Good job 👍🙏

Jan 29, 2023 · Liked by Aurimas Griciūnas

Hello Aurimas,

Thanks for this read, your explanation of the cross-cutting topics was really useful 🤓

Hey, thank you so much for the content.

I have a question regarding the sentence from the Machine Learning Pipeline's Feature Retrieval section:

You pointed out that the train/test/validation split should be performed there, which is a good idea. What isn't clear to me is how to avoid data leakage at this point: if we calculate the features using the whole dataset, then the test and validation sets will be contaminated.

Do you propose creating separate Feature Tables for each of the three datasets?
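
To make the concern concrete, here is a minimal sketch of the kind of leakage I mean. It is only an illustration under my own assumptions: the column values, the standardization feature, and the helper names `fit_scaler` and `transform` are hypothetical and not taken from the article.

```python
import statistics


def fit_scaler(values):
    """Compute the mean and population standard deviation from the given rows only."""
    return statistics.mean(values), statistics.pstdev(values)


def transform(values, mean, std):
    """Apply previously fitted statistics to any split."""
    return [(v - mean) / std for v in values]


# Hypothetical raw feature column, split by position to keep the example deterministic.
raw = [1.0, 2.0, 3.0, 4.0, 100.0, 200.0]
train, test = raw[:4], raw[4:]

# Leaky variant: statistics are computed on the whole dataset,
# so the test rows influence the features the model is trained on.
leaky_mean, leaky_std = fit_scaler(raw)
leaky_train_features = transform(train, leaky_mean, leaky_std)

# Leak-free variant: statistics are fitted on the train split only
# and then reused unchanged for the test split.
train_mean, train_std = fit_scaler(train)
train_features = transform(train, train_mean, train_std)
test_features = transform(test, train_mean, train_std)
```

In the leak-free variant the split happens before any statistic is computed, which is why I am wondering whether that implies separate feature computation per split.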

Kind Regards!

