Visual Stream Processing

Nussknacker provides a drag-and-drop visual authoring tool (Nussknacker Designer) that lets you define stream processing logic without writing low-level Flink code. In Nussknacker, the processing logic is referred to as a scenario. Scenarios are composed of processing nodes; conceptually, data “flows” from node to node. A typical scenario diagram starts with a data source (typically a Kafka stream) and ends with a data sink (typically also a Kafka stream).

Sample scenario. Events are read from the 'financialOperation' stream, processed, and subsequently written to two output streams: "24h limit alert" and "1h limit alert".

Node types

The types and capabilities of nodes are presented below.

See the Glossary for an explanation of the difference between a component and a node.

Basic nodes

Data sources:

  • JSON data stream
  • Avro data stream - a JSON data stream whose schema is defined in a Schema Registry

Data sinks:

  • JSON data stream
  • Avro data stream - a JSON data stream whose schema is defined in a Schema Registry

Simple processing nodes:

  • Filter - forwards "downstream" only those records that meet the filtering criteria.
  • Split - splits the stream into two parallel branches; each branch carries the same data.
  • Switch - routes data to a processing branch based on filtering criteria. It is the equivalent of a switch (or case) statement in programming languages.
  • Union - similar to an SQL union; produces the union of two input data streams.
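The semantics of the four nodes above can be sketched in plain Python over lists of records. This is only an illustration, not Nussknacker or Flink code: the function names are hypothetical, the sketch assumes that a Switch drops records matching no condition, and it assumes that Union simply merges both inputs without removing duplicates.

```python
def filter_node(records, predicate):
    """Forward only the records that satisfy the predicate."""
    return [r for r in records if predicate(r)]


def split_node(records):
    """Return two branches that carry the same records."""
    records = list(records)
    return records, list(records)


def switch_node(records, cases):
    """Route each record to the first branch whose condition matches.

    `cases` is a list of (branch_name, condition) pairs.
    Assumption: records that match no condition are dropped.
    """
    branches = {name: [] for name, _ in cases}
    for record in records:
        for name, condition in cases:
            if condition(record):
                branches[name].append(record)
                break
    return branches


def union_node(left, right):
    """Merge two inputs into one stream (assumption: no de-duplication)."""
    return list(left) + list(right)
```

For example, `filter_node(events, lambda r: r["amount"] > 100)` keeps only the events whose hypothetical `amount` field exceeds 100.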

Aggregations in time windows

Aggregating data in time windows is at the heart of stream processing. Nussknacker supports aggregations in all three time window types supported by Flink:

  • Tumbling
  • Sliding
  • Session
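To make the difference between the three window types concrete, here is a minimal Python sketch of how a timestamp could be assigned to windows. It is a conceptual illustration, not Flink's implementation; the function names are hypothetical, and timestamps, sizes and gaps are assumed to be plain numbers in the same unit (for example, seconds).

```python
def tumbling_window(ts, size):
    """Return the (start, end) of the single fixed-size window containing ts."""
    start = ts - ts % size
    return (start, start + size)


def sliding_windows(ts, size, slide):
    """Return every (start, end) window of length `size` that contains ts.

    A new window starts every `slide` units, so windows overlap
    whenever slide < size.
    """
    windows = []
    start = ts - ts % slide  # latest window start not after ts
    while start > ts - size:
        windows.append((start, start + size))
        start -= slide
    return windows


def session_windows(timestamps, gap):
    """Group timestamps into sessions closed by `gap` units of inactivity."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] < gap:
            sessions[-1].append(ts)  # still within the current session
        else:
            sessions.append([ts])    # inactivity gap reached: new session
    return sessions
```

A tumbling window places each event in exactly one window, a sliding window may place it in several overlapping windows, and a session window has no fixed length: it stays open as long as events keep arriving within the gap.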

Example of a session window processing node parameterization: events are aggregated in a session window and grouped by the transactionID field present in the incoming data stream. The result is a list of status values from the incoming events, one list per session.
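The aggregation described in this example can be approximated in Python as follows. This is a sketch under stated assumptions: the `timestamp` field name, the numeric `gap` parameter and the function name are hypothetical, and events are processed as a finite batch rather than as an unbounded stream.

```python
from collections import defaultdict


def session_status_lists(events, gap):
    """Collect status values per session, grouped by transactionID.

    Each event is a dict with 'transactionID', 'status' and a numeric
    'timestamp' (assumed field name). A session for a given key ends
    after `gap` units of inactivity.
    Returns {transactionID: [statuses of session 1, statuses of session 2, ...]}.
    """
    by_key = defaultdict(list)
    for event in sorted(events, key=lambda e: e["timestamp"]):
        by_key[event["transactionID"]].append(event)

    result = {}
    for key, key_events in by_key.items():
        sessions = []
        last_ts = None
        for event in key_events:
            if last_ts is not None and event["timestamp"] - last_ts < gap:
                sessions[-1].append(event["status"])  # same session
            else:
                sessions.append([event["status"]])    # new session
            last_ts = event["timestamp"]
        result[key] = sessions
    return result
```

Each inner list corresponds to one session for one transactionID.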

Data enrichment

  • SQL - enriches processed events with data from an external SQL database (JDBC-compliant).
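Conceptually, SQL enrichment looks up additional data for each event and attaches it to the record. The Python sketch below uses the standard-library `sqlite3` module with an in-memory database as a stand-in for a JDBC-compliant database; the `customers` table, the `customerId` and `segment` fields, and the function names are all hypothetical.

```python
import sqlite3


def make_demo_db():
    """Create an in-memory database with a hypothetical lookup table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id TEXT PRIMARY KEY, segment TEXT)")
    conn.execute("INSERT INTO customers VALUES ('c1', 'gold')")
    return conn


def enrich(events, conn):
    """Attach the customer's segment to each event (None if not found)."""
    enriched = []
    for event in events:
        row = conn.execute(
            "SELECT segment FROM customers WHERE id = ?",
            (event["customerId"],),
        ).fetchone()
        enriched.append({**event, "segment": row[0] if row else None})
    return enriched
```

The query is parameterized, so values taken from the event are never concatenated into the SQL text.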

Other stream manipulation nodes

  • Previous value - gives access to data from the preceding event with the same (user-defined) key as the current event. Useful when a change of state needs to be detected.
  • Outer join - the equivalent of an SQL outer join.
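The idea behind the Previous value node can be sketched in Python by remembering the last value seen for each key. This is an illustration only; the function name, the `key_fn` and `value_fn` parameters, and the `id` and `state` fields used in the usage note are hypothetical.

```python
def with_previous_value(events, key_fn, value_fn):
    """Pair each event with the value of the preceding event for the same key.

    `key_fn` extracts the (user-defined) key and `value_fn` extracts the
    value to remember. The first event for a key is paired with None.
    """
    last_seen = {}
    paired = []
    for event in events:
        key = key_fn(event)
        paired.append((event, last_seen.get(key)))
        last_seen[key] = value_fn(event)
    return paired
```

A change of state can then be detected by comparing the two, for example `[e for e, prev in paired if prev is not None and prev != e["state"]]`.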

Auxiliary nodes

  • Delay - delays incoming records by a specified amount of time.

SpEL

SpEL (Spring Expression Language) is a powerful expression language that supports querying and manipulating data objects at runtime. Nussknacker uses it to access the data processed by a node and to extend the node's configuration capabilities. For example, SpEL can be used to:

  • create boolean expressions (for example in filters) based on logical or relational (equal, greater than, etc.) operators
  • access, query and manipulate fields of the incoming data record
  • format records (events) written to data sinks
  • call helper functions, for example for date and time, or access system variables
  • and much more.

Example of a SpEL expression used in a filter; the .size operator returns the number of elements in a list.

Schema Registry Support

Unlike tables in relational databases, Kafka topics carry no information about the structure of the data they transport. While this may seem very flexible, it becomes a problem when all data records in a topic are expected to share the same syntax and semantics. To address this, Confluent implemented Schema Registry, which, when used in conjunction with Kafka, stores topic metadata. When writing to a topic, information from Schema Registry is used to validate the output data against the schema. When reading from a topic, information from Schema Registry provides the syntax and semantics of the data fields in the consumed events. While a scenario is being authored, Nussknacker Designer uses information from Schema Registry to suggest the field names available in the given context.

Nussknacker uses Avro as a schema definition framework.
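As a rough illustration of what a schema makes possible, the Python sketch below parses an Avro-style record schema (Avro schemas are written in JSON), lists the field names that an editor could offer as hints, and validates a record against the schema. It is not the Schema Registry client or an Avro library: the `FinancialOperation` schema and its fields are hypothetical, and only a few primitive types are handled.

```python
import json

# Hypothetical Avro record schema, written in JSON as Avro schemas are.
SCHEMA_JSON = """
{
  "type": "record",
  "name": "FinancialOperation",
  "fields": [
    {"name": "transactionID", "type": "string"},
    {"name": "amount", "type": "double"},
    {"name": "status", "type": "string"}
  ]
}
"""

# Simplified mapping of a few Avro primitive types to Python types.
_PYTHON_TYPES = {
    "string": str,
    "double": float,
    "float": float,
    "int": int,
    "long": int,
    "boolean": bool,
}


def field_names(schema):
    """Return the field names an editor could suggest as hints."""
    return [field["name"] for field in schema["fields"]]


def validate(record, schema):
    """Return a list of validation errors (an empty list means valid)."""
    errors = []
    for field in schema["fields"]:
        name, avro_type = field["name"], field["type"]
        if name not in record:
            errors.append(f"missing field: {name}")
        elif not isinstance(record[name], _PYTHON_TYPES[avro_type]):
            errors.append(f"wrong type for {name}: expected {avro_type}")
    return errors


schema = json.loads(SCHEMA_JSON)
```

Calling `field_names(schema)` yields the names that could be offered as hints, and `validate(record, schema)` reports missing fields or mismatched types before a record is written.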

Example of a hint

Going forward

Nussknacker's development plans can be found on our Roadmap page. You can also help us make it an even better product by contributing a development idea in Nussknacker's repository on GitHub.