pyspark.sql.functions.regexp

pyspark.sql.functions.regexp(str: ColumnOrName, regexp: ColumnOrName) → pyspark.sql.column.Column

Returns true if str contains a match for the Java regex regexp, or false otherwise. The pattern does not have to match the whole string.

New in version 3.5.0.

Parameters
str : Column or str

target column to work on.

regexp : Column or str

regex pattern to apply.

Returns
Column

true if str contains a match for the Java regex regexp, or false otherwise.
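
The matching behavior can be illustrated outside Spark with a minimal sketch. It uses Python's standard re module as a stand-in for the Java regex engine, and it assumes regexp tests whether the pattern is found anywhere in the string, which is consistent with the examples on this page. The helper name contains_match is hypothetical and is not part of PySpark.

```python
import re


def contains_match(text: str, pattern: str) -> bool:
    """Hypothetical helper: True if `pattern` is found anywhere in `text`."""
    return re.search(pattern, text) is not None


text = "1a 2b 14m"

# A run of digits occurs inside the string, so the pattern is found.
found_digits = contains_match(text, r"(\d+)")

# No two digits are immediately followed by "b" ("2b" has only one digit).
found_two_digits_b = contains_match(text, r"\d{2}b")

# Anchors force the pattern to cover the whole string, which is not all digits.
whole_is_digits = contains_match(text, r"^\d+$")
```

For simple patterns like these, Python and Java regex syntax agree; more advanced constructs can differ between the two engines, so patterns passed to Spark should follow Java's rules.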

Examples

>>> import pyspark.sql.functions as sf
>>> spark.createDataFrame(
...     [("1a 2b 14m", r"(\d+)")], ["str", "regexp"]
... ).select(sf.regexp('str', sf.lit(r'(\d+)'))).show()
+------------------+
|REGEXP(str, (\d+))|
+------------------+
|              true|
+------------------+
>>> import pyspark.sql.functions as sf
>>> spark.createDataFrame(
...     [("1a 2b 14m", r"(\d+)")], ["str", "regexp"]
... ).select(sf.regexp('str', sf.lit(r'\d{2}b'))).show()
+-------------------+
|REGEXP(str, \d{2}b)|
+-------------------+
|              false|
+-------------------+
>>> import pyspark.sql.functions as sf
>>> spark.createDataFrame(
...     [("1a 2b 14m", r"(\d+)")], ["str", "regexp"]
... ).select(sf.regexp('str', sf.col("regexp"))).show()
+-------------------+
|REGEXP(str, regexp)|
+-------------------+
|               true|
+-------------------+