Data Hub Module

The Data Hub Module provides a persistent repository to help you manage and understand your most critical data assets. It supports Master Data Management and Business Intelligence initiatives. The module is built on a graph database, which lets companies rapidly capture and evolve data models based on complex real-world relationships that may span processes, interactions, hierarchies, roles, and domains. From these models, companies can extract actionable insight to drive business outcomes.


The Data Hub Module consists of:

  • Write to Hub—A sink stage that allows you to intuitively create a model using input data to define entities, relationships, and properties. Upon execution, Write to Hub loads the data into the hub.
  • Import to Hub—A stage that uses two incoming channels of data, one for entities and one for relationships, to define a new model or populate an existing model. Includes an optional outgoing error port that collects records not successfully processed by the dataflow.
  • Read from Hub—A source stage that uses a saved or new query to read the data inside an existing model. It then returns that data as fields in your dataflow's output stage and makes it available for use with other stages or processes.
  • Query Hub—An intermediate stage that uses incoming data rows to define queries that extract specific entities and relationships from a model. For example, Query Hub can be used as part of a service to understand a customer's influence score within the network or determine if a customer record already exists in the hub.
  • Merge Entities—A stage that accesses data from an existing model and enables you to merge two or more entities into one.
  • Split Entity—A stage that accesses data from an existing model and enables you to split one entity into two or more new entities.
  • Delete from Hub—A stage that enables you to delete entities and relationships from an existing model.
  • Relationship Analysis Client—A web browser tool that provides a visual interface for viewing relationships and hierarchies within the hub, discovering hidden or non-obvious relationships, creating what-if scenarios, performing temporal or geospatial analysis, creating rules-driven event triggers, and running centrality algorithms to determine influence scores, either against the entire network or against the data being visualized within the client.
  • Data Hub Browser—A discovery tool where you can search the contents of a model by browsing the results of a natural-language-inspired query based on the model’s metadata.
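
The concepts behind these stages (entities, relationships, properties, merging, and centrality) can be illustrated with a small sketch. The Python below is not the Data Hub API; the Model class, its method names, and the sample customer data are all hypothetical and exist only to show the idea. It builds a toy in-memory graph, merges two entities that represent the same customer, and uses a simple relationship count as a stand-in for an influence score.

```python
from collections import defaultdict


class Model:
    """Toy in-memory property graph (hypothetical; not the Data Hub API)."""

    def __init__(self):
        self.entities = {}          # entity id -> dict of properties
        self.relationships = set()  # (source id, label, target id)

    def add_entity(self, entity_id, **properties):
        self.entities.setdefault(entity_id, {}).update(properties)

    def add_relationship(self, source, label, target):
        self.relationships.add((source, label, target))

    def merge_entities(self, keep, *others):
        """Fold `others` into `keep` and re-point their relationships."""
        for other in others:
            # Properties already on `keep` win over the merged entity's.
            self.entities[keep] = {**self.entities.pop(other), **self.entities[keep]}
        gone = set(others)
        self.relationships = {
            (keep if s in gone else s, label, keep if t in gone else t)
            for s, label, t in self.relationships
        }

    def degree_centrality(self):
        """Count the relationships touching each entity (a simple influence proxy)."""
        degree = defaultdict(int)
        for s, _, t in self.relationships:
            degree[s] += 1
            degree[t] += 1
        return {entity_id: degree[entity_id] for entity_id in self.entities}


# Two records that describe the same customer, plus related entities.
model = Model()
model.add_entity("c1", name="Ann Lee", email="ann@example.com")
model.add_entity("c2", name="A. Lee", phone="555-0100")
model.add_entity("acct", type="Account")
model.add_entity("store", type="Store")
model.add_relationship("c1", "owns", "acct")
model.add_relationship("c2", "shops_at", "store")

# Merge the duplicate into c1, then score every entity.
model.merge_entities("c1", "c2")
scores = model.degree_centrality()
```

After the merge, the single customer entity carries the properties of both records and both relationships, so its relationship count is higher than that of either original record. Real centrality algorithms are usually more involved than a plain count; degree is used here only because it is the simplest to follow.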