Cloud Gal 42

Data Discovery

July 21, 2021 / July 14, 2021 by admin

Data discovery is one of the most important new trends in business intelligence. It departs from traditional business intelligence by emphasizing interactive, visual analytics over static reporting. The goal is to let people apply their intuition to find meaningful and important information in data. The process is usually iterative: ask a question of the data, view the results visually, and refine the question.

Contrast this with the traditional approach, in which information consumers ask questions, reports are developed to answer them, and those reports are delivered to the consumers. The reports often raise further questions, which in turn require more reports.

Data Discovery Approaches

Progressive companies treat data as a strategic asset and understand its importance in driving innovation, differentiation, and growth. But transforming data into real business value requires a holistic approach to business intelligence and analytics. That approach goes beyond the scope of most data visualization tools and differs dramatically from the business intelligence (BI) platforms of years past.

The continuing evolution of data discovery in the enterprise and the cloud is being driven by the trends listed below:

  • Big data: On big data projects, data discovery is both more important and more challenging. The volume of data that must be processed efficiently is larger, and the diversity of sources and formats causes many traditional discovery methods to fail. When an initiative also requires rapid profiling of high-velocity data, profiling becomes harder still and may not be feasible with existing toolsets.
  • Real-time analytics: The ongoing shift toward (nearly) real-time analytics has created a new class of use cases for data discovery. These use cases are valuable but require data discovery tools that are faster, more automated, and more adaptive.
  • Agile analytics and agile business intelligence: Data scientists and business intelligence teams are adopting more agile, iterative methods of turning data into business value. They run data discovery more often and in more varied ways, such as profiling new data sets for integration, answering this week's questions that arose from last week's analysis, or surfacing alerts about emerging trends that may warrant new analysis work streams.

Different Data Discovery Techniques

Data discovery techniques vary, but they all aid the user by consolidating data within a defined context. That context enables quick evaluation and, ideally, the creation of actionable information. Three basic methods are normally used to discover, categorize, and present data:

  1. Metadata: This data discovery option uses automated tools to discover the semantics of data elements within data sets. Relational databases store metadata that describes column and table attributes. A search for possible credit card numbers in a database, for example, could use column attributes (e.g., column name, data type, or data size) to identify columns that might hold credit card numbers. The metadata method is the most common data discovery technique.
  2. Labels: Data elements can often be grouped based on a descriptive term. The term, or tag, can then be used in subsequent data discovery processes. Labels must be applied when the data is created; additional tags can then be added over time to provide references or extra information. Labeling is less rigid than metadata and is more commonly used with flat files. This data discovery option becomes increasingly useful as more databases move to Indexed Sequential Access Method (ISAM) or quasi-relational data storage, a cloud database service approach popular for handling rapidly growing data sets.
  3. Content analysis: This process analyzes data using pattern matching; hashing; and statistical, lexical, or other types of probability analysis. Content analysis is a growing trend across multiple industries, as it has proven successful in data loss prevention (DLP) and web content analysis products.
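
To make the credit card example concrete, here is a minimal Python sketch of two of the methods above: a metadata scan that looks only at column names, and a content scan that pattern-matches stored values and confirms them with a Luhn checksum. Everything specific in it is an assumption made for illustration: the in-memory SQLite database, the `orders` table and its rows, the name hints in `NAME_HINTS`, and the helper function names. The card numbers are widely published test numbers, not real accounts.

```python
import re
import sqlite3

# Hypothetical column-name hints; a real scan would tune these per environment.
NAME_HINTS = re.compile(r"(card|cc_?num|pan)", re.IGNORECASE)
# 13 to 19 digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def column_names(conn, table):
    """Read column names from SQLite's catalog (table name is trusted here)."""
    # Each PRAGMA row is (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]


def metadata_candidates(conn, table):
    """Metadata method: flag columns whose name suggests card data."""
    return [name for name in column_names(conn, table) if NAME_HINTS.search(name)]


def content_candidates(conn, table):
    """Content analysis: flag columns that hold Luhn-valid, card-like values."""
    hits = set()
    for col in column_names(conn, table):
        for (value,) in conn.execute(f'SELECT "{col}" FROM "{table}"'):
            for match in CARD_PATTERN.finditer(str(value)):
                digits = re.sub(r"\D", "", match.group())
                if luhn_valid(digits):
                    hits.add(col)
    return sorted(hits)


# Illustrative data only: a made-up table with published test card numbers.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, card_number TEXT, notes TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "4111 1111 1111 1111", "gift wrap"),
        (2, "5500-0000-0000-0004", "paid with 4012888888881881 by phone"),
    ],
)

by_metadata = metadata_candidates(conn, "orders")
by_content = content_candidates(conn, "orders")
print(by_metadata)  # ['card_number']
print(by_content)   # ['card_number', 'notes']
```

In this sketch the metadata scan is cheap but finds only the column whose name gives it away, while the content scan also catches the card number buried in the free-text `notes` column, at the cost of reading every value.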

Related article – Implementation of Data Discovery
