Cataloging Metadata

Metadata gives context to the data it describes and makes that data easier to understand. It helps organizations make informed decisions, which is why most data-driven organizations are moving towards metadata management.

Benefits of metadata management:
  • Better understanding of the business data, its creation, manipulation, and the business rules associated with it.
  • Ease in locating specific data in varied data sets across the organization.
  • Tracking data upstream and downstream to know how a change impacts linked data and workflows.
  • Ensuring data quality: Since metadata defines data, managing it correctly helps ensure data correctness and consistency. It is the key to identifying malformed data and preventing database integrity issues.
The wide adoption of big data and the growth of data governance rules are other factors that have made metadata management a necessity.

How Discovery helps you manage metadata

The Catalog page of Spectrum Discovery gives you a unified view of all your configured connections. You can discover the assets from all the connections and perform these tasks on the discovered metadata:

  • Search for specific assets (table, column, or view) in your discovered connections.
  • Add tags to assets to give them relevant context and make them easy to find later.
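
To make the two tasks above concrete, here is a minimal conceptual sketch in Python of searching and tagging assets in a catalog. It is not the Spectrum Discovery API: the Asset class, the search, add_tag, and find_by_tag functions, and the connection and asset names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """A discovered asset. All names here are hypothetical, not product API."""
    connection: str
    kind: str  # "table", "column", or "view"
    name: str
    tags: set = field(default_factory=set)


def search(catalog, term, kind=None):
    """Return assets whose name contains term (case-insensitive), optionally filtered by kind."""
    term = term.lower()
    return [
        asset for asset in catalog
        if term in asset.name.lower() and (kind is None or asset.kind == kind)
    ]


def add_tag(asset, tag):
    """Attach a normalized (trimmed, lowercase) tag to an asset."""
    asset.tags.add(tag.strip().lower())


def find_by_tag(catalog, tag):
    """Return every asset that carries the given tag."""
    tag = tag.strip().lower()
    return [asset for asset in catalog if tag in asset.tags]


# Made-up sample assets from two assumed connections.
catalog = [
    Asset("cassandra_orders", "table", "customer_orders"),
    Asset("mssql_crm", "table", "Customer"),
    Asset("mssql_crm", "column", "Customer.email"),
    Asset("mssql_crm", "view", "vw_active_customers"),
]

customer_tables = search(catalog, "customer", kind="table")
add_tag(catalog[2], "PII")
pii_assets = find_by_tag(catalog, "pii")
```

In this sketch, tags are normalized to lowercase so that "PII" and "pii" refer to the same tag; whether the product treats tags as case-sensitive is not stated here.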

Getting started with Discovery

Suppose the data in your organization resides in two types of data sources:
  • A NoSQL database, such as Apache Cassandra
  • A relational database, such as MS SQL
To access this data on the Catalog dashboard, you must first connect to these data sources. In Spectrum Discovery, you do this from the Connections main menu option.
Note: For details on configuring connections, see the sections Connecting to Apache Cassandra and Connecting to a JDBC Database.
Once you have created and tested the relevant connections:
  • The connections become visible on the Catalog page.
  • You can run discovery on the connections (preferably one at a time).
  • The discovery process fetches all the assets in the connections and presents a unified view of the data.
  • You can create a catalog of the required data across your connections and tag the assets for use in other modules.
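
The flow above can be pictured with a short conceptual sketch in Python. It only models the idea of running discovery on one connection at a time and merging the results into a unified view; it does not call Spectrum Discovery or any real database. The connection names, the fetch functions, and the sample assets are assumptions made for illustration.

```python
def fetch_cassandra_assets():
    # Hypothetical stand-in for assets found through an Apache Cassandra connection.
    return [
        ("table", "orders_by_customer"),
        ("column", "orders_by_customer.order_id"),
    ]


def fetch_mssql_assets():
    # Hypothetical stand-in for assets found through a JDBC (MS SQL) connection.
    return [
        ("table", "Customer"),
        ("view", "vw_active_customers"),
    ]


def discover(connections):
    """Run discovery on each connection in turn and merge the results into one list."""
    unified = []
    for connection_name, fetch_assets in connections.items():
        for asset_type, asset_name in fetch_assets():
            unified.append(
                {"connection": connection_name, "type": asset_type, "name": asset_name}
            )
    return unified


# Assumed connection names mapped to their (hypothetical) discovery functions.
connections = {
    "cassandra_sales": fetch_cassandra_assets,
    "mssql_crm": fetch_mssql_assets,
}

catalog = discover(connections)
for entry in catalog:
    print(f'{entry["connection"]:<16} {entry["type"]:<7} {entry["name"]}')
```

Each entry keeps the name of the connection it came from, so assets from the NoSQL and relational sources can sit in one list without losing their origin.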