Types of Flows

A flow is a series of operations that takes data from a source, processes that data, then writes the output to a destination. The processing can range from simple sorting to more complex data quality and enrichment actions.

While the concept of a flow is simple, you can design very complex flows with branching paths, multiple sources of input, and multiple output destinations. The New Flow page is the starting point to build a job flow, service flow, or subflow.

Job

A job is a flow that performs batch processing. A job reads data from one or more files or databases, processes that data, and writes the output to one or more files or databases. Run jobs manually in Flow Designer or from a command line using the job executor.

The job example below uses the Read from File stage for input and two Write to File stages for output.
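
As a rough illustration of the batch pattern only (not of how Flow Designer stages are implemented), the following Python sketch reads all records from one source, sorts them, and writes them to two outputs. The field names, the sample data, and the rule for splitting records between the two outputs are all hypothetical; in-memory text stands in for the input and output files.

```python
import csv
import io


def run_job(source_text):
    """Read every record, sort by name, and split the records across two outputs."""
    # "Read from File": load the whole batch from the source.
    records = list(csv.DictReader(io.StringIO(source_text)))

    # Processing step: a simple sort.
    records.sort(key=lambda r: r["name"])

    # Two "Write to File" destinations (in-memory here to stay self-contained).
    complete_out = io.StringIO()
    incomplete_out = io.StringIO()
    fields = ["name", "postcode"]
    complete = csv.DictWriter(complete_out, fieldnames=fields, lineterminator="\n")
    incomplete = csv.DictWriter(incomplete_out, fieldnames=fields, lineterminator="\n")
    complete.writeheader()
    incomplete.writeheader()

    # Hypothetical routing rule: records without a postcode go to the second output.
    for record in records:
        writer = complete if record["postcode"] else incomplete
        writer.writerow(record)

    return complete_out.getvalue(), incomplete_out.getvalue()


SOURCE = "name,postcode\nZoe,12180\nAdam,\nMia,60601\n"
complete_text, incomplete_text = run_job(SOURCE)
print(complete_text)
print(incomplete_text)
```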

Service

A service is a flow that you can access as a web service or through the Spectrum Technology Platform API. You pass a record to the service and optionally specify the options to use when processing the record. The service processes the data and returns the processed data.

Some services become available when you install a module. For example, when you install the Universal Addressing Module, the service ValidateAddress becomes available on your system. In other cases, you must create a service in Flow Designer, then expose that service on your system as a user-defined service. For example, Spectrum Spatial stages are not available as services unless you first create a service using the module's stages.

You can also design your own custom services. For example, you can create a flow that helps to determine whether an address is at risk for flooding.

Note: Because the service name, option names, and field names ultimately become XML elements, they must not contain characters that are invalid in XML element names (for example, spaces are not valid). Services whose names do not meet the rules for well-formed XML still function but are not exposed as web services.
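
One way to catch such names before exposing a service is to check them against the XML naming rules. The sketch below is a simplified, ASCII-only approximation of those rules (the XML specification also allows many non-ASCII characters, and colons, which are left out here). The function names and sample names are hypothetical and are not part of any Spectrum API.

```python
import re

# Simplified rule: start with a letter or underscore, then letters, digits,
# underscores, periods, or hyphens. Spaces are never allowed.
_XML_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_.-]*")


def is_valid_element_name(name):
    """Return True if the whole name matches the simplified XML name rule."""
    return _XML_NAME.fullmatch(name) is not None


def invalid_names(names):
    """Return the names that could not be used as XML element names."""
    return [n for n in names if not is_valid_element_name(n)]


candidates = ["ValidateAddress", "Flood Risk", "AddressLine1", "1stChoice"]
print(invalid_names(candidates))  # ['Flood Risk', '1stChoice']
```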

Subflow

A subflow is a flow that can be reused within other flows. Subflows are useful when you want to create a reusable process that can be easily incorporated into flows. For example, you might want to create a subflow that performs deduplication using certain settings in each stage so that you can use the same deduplication process in multiple flows.

You could then use this subflow in a flow. For example, you could use the deduplication subflow within a flow that performs geocoding so that the data is deduplicated before the geocoding operation.

In this type of flow, data is read from a database and passed to the deduplication subflow, where it is processed by Match Key Generator, then Intraflow Match, then Best of Breed. The data then leaves the subflow and continues to the next stage in the parent flow, in this case Geocode US Address.
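
The reuse idea is similar to factoring shared logic into a function that several callers use. The Python sketch below is only an analogy: the three steps inside `dedupe_subflow` are simplistic stand-ins for Match Key Generator, Intraflow Match, and Best of Breed, not their actual behavior, and every name and field in it is hypothetical.

```python
from collections import defaultdict


def match_key(record):
    """Stand-in for Match Key Generator: build a crude key from name and postcode."""
    return (record["name"].strip().lower(), record["postcode"].strip())


def dedupe_subflow(records):
    """Reusable 'subflow': group records by key, then keep one record per group."""
    groups = defaultdict(list)
    for record in records:
        # Stand-in for Intraflow Match: records with equal keys are duplicates.
        groups[match_key(record)].append(record)
    # Stand-in for Best of Breed: keep the record with the most non-empty fields.
    return [
        max(group, key=lambda r: sum(1 for value in r.values() if value))
        for group in groups.values()
    ]


def geocoding_flow(records):
    """Parent flow 1: deduplicate, then pass each record to a geocoding stub."""
    return [{**r, "geocoded": True} for r in dedupe_subflow(records)]


def mailing_flow(records):
    """Parent flow 2: reuses the same subflow, then sorts by postcode."""
    return sorted(dedupe_subflow(records), key=lambda r: r["postcode"])


records = [
    {"name": "Ann Lee", "postcode": "12180", "phone": ""},
    {"name": "ann lee ", "postcode": "12180", "phone": "555-0100"},
    {"name": "Bo Chen", "postcode": "60601", "phone": ""},
]
geocoded = geocoding_flow(records)
mailing = mailing_flow(records)
```

Both parent flows get identical deduplication settings because they call the same function, which is the benefit a subflow provides.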

Process Flow

A process flow runs a series of activities such as jobs and external applications. Each activity in the process flow runs after the previous activity finishes. Process flows are useful if you want to run multiple flows in sequence or if you want to run an external program. For example, a process flow could run a job to standardize names, run a second job to validate addresses, then invoke an external application to sort the records into the proper sequence to claim postal discounts. Such a process flow would look like this:

In this example, the jobs Standardize Names and Validate Addresses are exposed jobs on the Spectrum Technology Platform server. Run Program invokes an external application, and the Success activity indicates the end of the process flow.
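
The sequencing described here can be sketched in a few lines of Python. This is a minimal illustration, not the platform's scheduler: each activity is a hypothetical callable that returns True when it finishes successfully, and stopping at the first failure is an assumption of the sketch rather than documented behavior.

```python
def run_process_flow(activities):
    """Run (name, callable) activities strictly in order.

    Each activity starts only after the previous one has returned.
    Returns the names that completed and a final status string.
    """
    completed = []
    for name, activity in activities:
        if not activity():
            # Assumption: a failed activity ends the process flow.
            return completed, "Failed at " + name
        completed.append(name)
    return completed, "Success"


log = []


def make_activity(label):
    """Build a stand-in activity that records that it ran and reports success."""
    def activity():
        log.append(label)
        return True
    return activity


flow = [
    ("Standardize Names", make_activity("standardize")),
    ("Validate Addresses", make_activity("validate")),
    ("Run Program", make_activity("sort for postal discounts")),
]
completed, status = run_process_flow(flow)
print(completed, status)
```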