Advertisement

Data Lake

Last updated: July 11, 2021

What Does Data Lake Mean?

A data lake is an easily accessible, centralized storage repository for large volumes of structured and unstructured data. A data lake has a flat architecture and uses object storage to store data.

Data lakes play an important role in helping data scientists visualize and analyze data from disparate data in their native formats. In data science, this is an especially important consideration when the scope of the data -- and its uses -- may not yet be fully known.

Although data lakes offer strong data access benefits, they require a management component to help users find the most relevant data, understand relationships and integrate heterogeneous data sources. Popular data lake platforms include:

A data lake may also be referred to as a schema-agnostic or schema-less data repository.

Advertisement

Techopedia Explains Data Lake

The data lake architecture is a store-everything approach to big data. Data is not classified when it is stored in the repository and the value of the data is not clear at the outset. When the data is accessed, only then will it be classified and organized for analysis.

Data lakes were developed to promote the accessibility and reuse of data. Hadoop, an open-source framework for processing and analyzing big data, can be used to sift through the data in the repository.

Advertisement

Share this Term

  • Facebook
  • LinkedIn
  • Twitter

Related Reading

Tags

Data ManagementEmerging TechnologyBig DataHadoop

Trending Articles

Go back to top