Save a PySpark DataFrame to S3 as CSV with a specified file name

Asked 2 years ago, Updated 2 years ago, 79 views

I am currently developing a Python script, and as part of its processing I export the contents of a Spark DataFrame to S3 as CSV.
I would like to specify the output file name in the script, but I can't find a good way to do it. I would appreciate any tips, however small.

# Convert to Spark DataFrame type
sdf2 = spark.createDataFrame(pdf_out.fillna(''))
sdf3 = spark.createDataFrame(pdf_out_null.fillna(''))

# CSV file storage paths
successpath = 's3://****/CSV/success'
temppath = 's3://****/CSV/temp'


# Output to S3 as CSV file
sdf2.coalesce(1).write.mode('append').csv(successpath, header=True)
sdf3.coalesce(1).write.mode('append').csv(temppath, header=True)

python csv pyspark

2022-09-29 21:22

1 Answer

Sorry to just point you to a link, but a similar question with an applicable answer can be found here.

Also, it looks like your file storage paths don't have an extension. Note that DataFrameWriter.csv() treats the given path as a directory and writes part files inside it, so the path itself never names a single CSV file.
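
For reference, since Spark always writes a directory of part files, a common workaround is to coalesce(1), write to the path as you already do, and then copy the single part-*.csv object to the key you actually want with boto3. The sketch below is a minimal illustration under assumptions, not a definitive recipe: the bucket name is the masked value from the question, the prefix and target key are hypothetical, boto3 credentials are assumed to be configured, and exactly one part file is assumed to exist (mode('append') can accumulate several across runs).

import boto3

# Copy the single Spark part file to the desired key, then clean up.
# bucket_name, prefix, and target_key below are placeholders.
s3 = boto3.resource('s3')
bucket_name = '****'               # masked bucket name from the question
prefix = 'CSV/success/'            # directory that Spark wrote to
target_key = 'CSV/success.csv'     # the single file name you actually want

bucket = s3.Bucket(bucket_name)
for obj in bucket.objects.filter(Prefix=prefix):
    if obj.key.endswith('.csv'):
        # server-side copy of the part-*.csv object to the target name
        s3.Object(bucket_name, target_key).copy_from(
            CopySource={'Bucket': bucket_name, 'Key': obj.key})
    obj.delete()                   # remove the part file and the _SUCCESS marker

The copy runs server-side on S3, so nothing is downloaded to the driver. For small DataFrames, another option is to convert to pandas and write one file directly, at the cost of pulling all data onto the driver.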


2022-09-29 21:22


