How do I iterate over a DataFrame with a for statement in Spark Python (PySpark)?

Asked 2 years ago, Updated 2 years ago, 92 views

I applied a for statement to a DataFrame as below, and it printed the columns. I would like the values to come out as lists or tuples, like when you call fetchall() on a DB cursor, instead of columns. Is there a way?

for i in df:
    print(i)

python dataframe scala

2022-09-20 19:55

1 Answer

Use the collect method on the DataFrame. (The example below is Scala, but PySpark's DataFrame has the same collect() method.)

val jsonStr = Seq("""{"id" : "1", "name": "aaaaa", "addr": "seoul", "data": 10}""", 
                  """{"id" : "2", "name": "bbbbb", "addr": "pusan", "data": 20}""",
                  """{"id" : "3", "name": "aaaaa", "addr": "pusan", "data": 30}""",
                  """{"id" : "4", "name": "bbbbb", "addr": "seoul", "data": 40}""",
                  """{"id" : "5", "name": "aaaaa", "addr": "pusan", "data": 50}""",
                  """{"id" : "6", "name": "aaaaa", "addr": "pusan", "data": 60}""",
                  """{"id" : "7", "name": "bbbbb", "addr": "seoul", "data": 70}""") 

val rddData = spark.sparkContext.parallelize(jsonStr)
val resultDF = spark.read.json(rddData)


resultDF.collect()
res14: Array[org.apache.spark.sql.Row] = Array([seoul,10,1,aaaaa], [pusan,20,2,bbbbb], [pusan,30,3,aaaaa], [seoul,40,4,bbbbb], [pusan,50,5,aaaaa], [pusan,60,6,aaaaa], [seoul,70,7,bbbbb])

resultDF.collect().foreach(i => println(i))
[seoul,10,1,aaaaa]
[pusan,20,2,bbbbb]
[pusan,30,3,aaaaa]
[seoul,40,4,bbbbb]
[pusan,50,5,aaaaa]
[pusan,60,6,aaaaa]
[seoul,70,7,bbbbb]


2022-09-20 19:55



© 2024 OneMinuteCode. All rights reserved.