Datalake

“A data lake is a storage repository that holds a vast amount of raw data in its native format, including structured, semi-structured, and unstructured data. The data structure and requirements are not defined until the data is needed.”

James Dixon, founder and CTO of Pentaho, although what makes this wisdom is how often it has been repeated.

… whereas a data warehouse has structure and a schema!
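To make the contrast concrete, here is a minimal sketch of the two approaches, not a definitive implementation. The lake side keeps raw JSON lines untouched and only interprets them at query time; the warehouse side declares a schema first and rejects rows that do not fit. The sample records, the `orders` table, and the function names `read_amounts` and `load_warehouse` are all made up for illustration, and an in-memory SQLite database stands in for a real warehouse.

```python
import json
import sqlite3

# Lake side: raw records stored as-is, in their native format (JSON lines here).
# These sample records are hypothetical.
raw_lines = [
    '{"customer": "ana", "amount": "12.50", "note": "first order"}',
    '{"customer": "ben", "amount": 7}',
    '{"customer": "cy", "tags": ["vip"]}',
]


def read_amounts(lines):
    """Schema-on-read: decide what 'amount' means only when the data is needed."""
    totals = {}
    for line in lines:
        record = json.loads(line)
        if "amount" in record:
            # Interpretation (string or number -> float) happens at query time.
            totals[record["customer"]] = float(record["amount"])
    return totals


def load_warehouse(lines):
    """Schema-on-write: the table structure exists before any row is loaded."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders (customer TEXT NOT NULL, amount REAL NOT NULL)"
    )
    rejected = []
    for line in lines:
        record = json.loads(line)
        try:
            conn.execute(
                "INSERT INTO orders (customer, amount) VALUES (?, ?)",
                (record["customer"], record.get("amount")),
            )
        except sqlite3.IntegrityError:
            # The row does not satisfy the schema (missing amount), so it is refused.
            rejected.append(record["customer"])
    return conn, rejected


lake_totals = read_amounts(raw_lines)
warehouse, rejected = load_warehouse(raw_lines)
warehouse_total = warehouse.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
```

The lake happily holds the third record even though it has no amount; the warehouse turns it away at load time because the schema says an amount is required.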

