Save a PySpark DataFrame to S3 as CSV with a specified file name

Asked 2 years ago, Updated 2 years ago, 79 views

I am currently developing a Python script, and as part of its processing I export the contents of a Spark DataFrame to S3 as CSV.
I would like to specify the output file name in the script, but I can't find a good way to do it. I would appreciate any tips, however small.

# Convert to Spark DataFrame type
sdf2 = spark.createDataFrame(pdf_out.fillna(''))
sdf3 = spark.createDataFrame(pdf_out_null.fillna(''))

# CSV file storage paths
successpath = 's3://****/CSV/success'
temppath = 's3://****/CSV/temp'


# Output to S3 as CSV file
sdf2.coalesce(1).write.mode('append').csv(successpath, header=True)
sdf3.coalesce(1).write.mode('append').csv(temppath, header=True)

python csv pyspark

2022-09-29 21:22

1 Answer

Sorry to just point you to a link, but a similar question with an applicable answer can be found here.

Also, it looks like your file storage paths don't have an extension. Note that DataFrameWriter.csv() treats the given path as a directory and writes part files inside it, so the path itself never names a single CSV file.
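
For reference, since Spark always writes a directory of part files, a common workaround is to coalesce(1), write to the path as you already do, and then copy the single part-*.csv object to the key you actually want with boto3. The sketch below is a minimal illustration under assumptions, not a definitive recipe: the bucket name is the masked value from the question, the prefix and target key are hypothetical, boto3 credentials are assumed to be configured, and exactly one part file is assumed to exist (mode('append') can accumulate several across runs).

import boto3

# Copy the single Spark part file to the desired key, then clean up.
# bucket_name, prefix, and target_key below are placeholders.
s3 = boto3.resource('s3')
bucket_name = '****'               # masked bucket name from the question
prefix = 'CSV/success/'            # directory that Spark wrote to
target_key = 'CSV/success.csv'     # the single file name you actually want

bucket = s3.Bucket(bucket_name)
for obj in bucket.objects.filter(Prefix=prefix):
    if obj.key.endswith('.csv'):
        # server-side copy of the part-*.csv object to the target name
        s3.Object(bucket_name, target_key).copy_from(
            CopySource={'Bucket': bucket_name, 'Key': obj.key})
    obj.delete()                   # remove the part file and the _SUCCESS marker

The copy runs server-side on S3, so nothing is downloaded to the driver. For small DataFrames, another option is to convert to pandas and write one file directly, at the cost of pulling all data onto the driver.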


2022-09-29 21:22


