pyspark.sql.DataFrameReader.text

DataFrameReader.text(paths: Union[str, List[str]], wholetext: bool = False, lineSep: Optional[str] = None, pathGlobFilter: Union[bool, str, None] = None, recursiveFileLookup: Union[bool, str, None] = None, modifiedBefore: Union[bool, str, None] = None, modifiedAfter: Union[bool, str, None] = None) → DataFrame

Loads text files and returns a DataFrame whose schema starts with a string column named “value”, followed by partition columns if there are any. The text files must be encoded as UTF-8.

By default, each line in the text file is a new row in the resulting DataFrame.

New in version 1.6.0.

Parameters

paths (str or list): string, or list of strings, for input path(s).

Other Parameters

Extra options: For the extra options, refer to Data Source Option in the version you use.

Examples

>>> df = spark.read.text('python/test_support/sql/text-test.txt')
>>> df.collect()
[Row(value='hello'), Row(value='this')]
>>> df = spark.read.text('python/test_support/sql/text-test.txt', wholetext=True)
>>> df.collect()
[Row(value='hello\nthis')]
