
Spark write result of Array[Array[Any]] to file

I have input like the following sample:

3070811,1963,1096,,"US","CA",,1,
3022811,1963,1096,,"US","CA",,1,56
3033811,1963,1096,,"US","CA",,1,23

After replacing the empty fields with 0, I am trying to write the result to a text file, and I am getting this error:

scala> result.saveAsTextFile("data/result")
<console>:34: error: value saveAsTextFile is not a member of Array[Array[Any]]
              result.saveAsTextFile("data/result")

Here is the code that produces it:

scala> val file2 = sc.textFile("data/file.txt")
scala> val mapper = file2.map(x => x.split(",",-1))
scala> val result = mapper.map(x => x.map(x => if(x.isEmpty) 0 else x)).collect()
result: Array[Array[Any]] = Array(Array(3070811, 1963, 1096, 0, "US", "CA", 0, 1, 0), Array(3022811, 1963, 1096, 0, "US", "CA", 0, 1, 56), Array(3033811, 1963, 1096, 0, "US", "CA", 0, 1, 23))
scala> result.saveAsTextFile("data/result")
<console>:34: error: value saveAsTextFile is not a member of Array[Array[Any]]
              result.saveAsTextFile("data/result")

I have also tried the following, and it failed as well:

scala> val output = result.map(x => (x(0),x(1),x(2),x(3), x(4), x(5), x(7), x(8)))
output: Array[(Any, Any, Any, Any, Any, Any, Any, Any)] = Array((3070811,1963,1096,0,"US","CA",1,0), (3022811,1963,1096,0,"US","CA",1,56), (3033811,1963,1096,0,"US","CA",1,23))
scala> output.saveAsTextFile("data/output")
<console>:36: error: value saveAsTextFile is not a member of Array[(Any, Any, Any, Any, Any, Any, Any, Any)]
              output.saveAsTextFile("data/output")

I then added the following, and it failed as well:

scala> output.mapValues(_.toList).saveAsTextFile("data/output")
<console>:36: error: value mapValues is not a member of Array[(Any, Any, Any, Any, Any, Any, Any, Any)]
              output.mapValues(_.toList).saveAsTextFile("data/output")

How can I view the contents of the result or output variables, either in the console or in a result file? I am missing something basic here.
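For the collected case, where result is already a plain local Array[Array[Any]] and not an RDD, a minimal sketch of printing and saving it without Spark might look like the following. The object name LocalArrayOutput, the helper names, and the temp-file path are hypothetical, and the sample rows are stand-ins for the collected data.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}

// Hypothetical helper object; not part of Spark or of the original code.
object LocalArrayOutput {
  // Render each row as one comma-separated line.
  def toLines(rows: Array[Array[Any]]): Array[String] =
    rows.map(_.mkString(","))

  // Write the lines to a local file, one row per line, and return the path.
  def writeLines(path: Path, lines: Array[String]): Path =
    Files.write(path, lines.mkString("\n").getBytes(StandardCharsets.UTF_8))

  def main(args: Array[String]): Unit = {
    // Stand-in for the collected `result`: Strings from split, Int 0 for blanks.
    val result: Array[Array[Any]] = Array(
      Array[Any]("3070811", "1963", "1096", 0, "\"US\"", "\"CA\"", 0, "1", 0),
      Array[Any]("3022811", "1963", "1096", 0, "\"US\"", "\"CA\"", 0, "1", "56")
    )
    val lines = toLines(result)
    lines.foreach(println)                          // view in the console
    val out = Files.createTempFile("result", ".txt") // assumed output location
    writeLines(out, lines)
    println(s"wrote ${lines.length} rows to $out")
  }
}
```

This only suits data small enough to collect to the driver; saveAsTextFile itself is defined on RDDs, which is why it is not available once collect() has returned an Array.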

Update 1

Per Shankar Koirala, I removed .collect and then performed the save.

scala> val result = mapper.map(x => x.map(x => if(x.isEmpty) 0 else x))

This results in the following output:

[Ljava.lang.Object;@7a1167b6
[Ljava.lang.Object;@60d86d2f
[Ljava.lang.Object;@20e85a55

Update 1.a

I picked up the updated answer, and it gives the correct data:

scala> val result = mapper.map(x => x.map(x => if(x.isEmpty) 0 else x).mkString(","))
result: org.apache.spark.rdd.RDD[String] = MapPartitionsRDD[29] at map at <console>:31
scala> result.saveAsTextFile("data/mkstring")

Result:

3070811,1963,1096,0,"US","CA",0,1,0
3022811,1963,1096,0,"US","CA",0,1,56
3033811,1963,1096,0,"US","CA",0,1,23
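The difference between the [Ljava.lang.Object;@... lines in Update 1 and the correct rows here comes down to how each element is turned into text: saveAsTextFile writes each element's toString, and a JVM array only has the default one. A small sketch in plain Scala, with no Spark involved, shows the two behaviours side by side; the object and method names are hypothetical.

```scala
// Hypothetical demo object; it only illustrates toString versus mkString.
object ArrayToStringDemo {
  // What gets written when the element is still an Array[Any].
  def defaultText(row: Array[Any]): String = row.toString

  // What gets written once the row is joined into a String first.
  def csvText(row: Array[Any]): String = row.mkString(",")

  def main(args: Array[String]): Unit = {
    val row = Array[Any]("3070811", "1963", "1096", 0)
    println(defaultText(row)) // type tag plus identity hash, not the contents
    println(csvText(row))     // the fields joined with commas
  }
}
```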

Update 2

scala> val output = result.map(x => (x(0),x(1),x(2),x(3), x(4), x(5), x(7), x(8)))
output: org.apache.spark.rdd.RDD[(Any, Any, Any, Any, Any, Any, Any, Any)] = MapPartitionsRDD[27] at map at <console>:33
scala> output.saveAsTextFile("data/newOutPut")

and I got this result:

(3070811,1963,1096,0,"US","CA",1,0)
(3022811,1963,1096,0,"US","CA",1,56)
(3033811,1963,1096,0,"US","CA",1,23)
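The parentheses in this output are the tuple's own toString. If plain comma-separated lines are wanted instead, one option is to join the tuple's fields through productIterator before saving. The sketch below shows the idea in plain Scala; TupleLineDemo and tupleToLine are hypothetical names, and applying the helper inside output.map before saveAsTextFile is an assumption about how it would be wired in.

```scala
// Hypothetical demo object; tuples are Products, so productIterator is available.
object TupleLineDemo {
  // Join all fields of a tuple with commas, dropping the surrounding parentheses.
  def tupleToLine(t: Product): String = t.productIterator.mkString(",")

  def main(args: Array[String]): Unit = {
    val row: (Any, Any, Any, Any) = ("3070811", "1963", 0, "\"US\"")
    println(row.toString)     // tuple form, wrapped in parentheses
    println(tupleToLine(row)) // same fields as one comma-separated line
  }
}
```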
