(outputMode: str) → pyspark.sql.streaming.readwriter.DataStreamWriter
Specifies how data of a streaming DataFrame/Dataset is written to a streaming sink.
New in version 2.0.0.
Options include:
- append: Only the new rows in the streaming DataFrame/Dataset will be written to
the sink
- complete: All the rows in the streaming DataFrame/Dataset will be written to the sink
every time there are some updates
- update: Only the rows that were updated in the streaming DataFrame/Dataset will be
written to the sink every time there are some updates. If the query does not contain aggregations, it will be equivalent to append mode.
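The difference between the three modes can be illustrated without a Spark session. The sketch below is a toy model in plain Python, not Spark's implementation: the helper name `simulate_output_modes` and the list-of-keys batch format are assumptions made for illustration. It treats each inner list as one micro-batch, keeps a running count per key, and records what each mode would hand to the sink on every trigger. The append entry shows the raw, un-aggregated input rows, while complete and update show the aggregated counts.

```python
from collections import Counter


def simulate_output_modes(batches):
    """Toy model of per-trigger sink output for each output mode.

    `batches` is a list of micro-batches, each a list of keys (hypothetical
    input format). Returns a dict mapping mode name to a list holding the
    output of each trigger.
    """
    counts = Counter()  # running aggregation state across triggers
    emitted = {"append": [], "complete": [], "update": []}
    for batch in batches:
        changed = Counter(batch)  # keys touched by this micro-batch
        counts.update(changed)
        # append (modelled on the un-aggregated stream): only the new rows
        emitted["append"].append(list(batch))
        # complete: the entire result table, every trigger
        emitted["complete"].append(dict(sorted(counts.items())))
        # update: only the rows whose count changed in this trigger
        emitted["update"].append({k: counts[k] for k in sorted(changed)})
    return emitted


result = simulate_output_modes([["a", "b"], ["a"]])
for mode, outputs in result.items():
    print(mode, outputs)
```

In the second trigger only key `"a"` receives new data, so update emits just that row, while complete re-emits both rows.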
This API is evolving.
>>> df = spark.readStream.format("rate").load()
>>> df.writeStream.outputMode('append')
<pyspark.sql.streaming.readwriter.DataStreamWriter object ...>
The example below uses complete mode, so the entire aggregated count is printed out on each update.
>>> import time
>>> df = spark.readStream.format("rate").option("rowsPerSecond", 10).load()
>>> df = df.groupby().count()
>>> q = df.writeStream.outputMode("complete").format("console").start()
>>> time.sleep(3)
>>> q.stop()