Spark SQL: Spark SQL is a Spark module for structured data processing. One use of Spark SQL is to execute SQL queries written in basic SQL syntax. There are several ways to interact with Spark SQL, including SQL itself, the DataFrame API, and the Dataset API. The backbone of all these operations is the DataFrame and the SchemaRDD. DataFrames: A DataFrame is a distributed collection of data organised into named columns. It is conceptually equivalent to a table in a relational database. SchemaRDD: A SchemaRDD is made up of Row objects along with metadata describing the schema. Spark SQL needs an SQLContext object, which is created from an existing SparkContext. The steps below create DataFrames and SchemaRDDs and perform some operations using the sql method provided by the SQLContext. Step 1: Start the Spark shell using the following command....
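The flow described above can be sketched as follows in the Spark shell (where `sc` is already provided). This is a minimal sketch against the Spark 1.x API that this post describes; the JSON file path and the `people` table name are assumptions for illustration.

```scala
import org.apache.spark.sql.SQLContext

// Create an SQLContext from the existing SparkContext (`sc` in spark-shell)
val sqlContext = new SQLContext(sc)

// Build a DataFrame from a data source (this JSON path is hypothetical)
val df = sqlContext.read.json("/users/guest/people.json")

// Inspect the inferred schema and the data
df.printSchema()
df.show()

// Register the DataFrame as a temporary table so it can be queried with SQL
df.registerTempTable("people")

// Run a SQL query via the sql method provided by the SQLContext
val adults = sqlContext.sql("SELECT name FROM people WHERE age >= 18")
adults.show()
```

Each call to `sqlContext.sql` returns another DataFrame, so queries and DataFrame transformations can be freely mixed.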
WordCount in Spark: The WordCount program is the basic "hello world" of the Big Data world. Below is a program that achieves WordCount in Spark with very few lines of code. [code lang="scala"]// Read the input file into an RDD of lines
val inputlines = sc.textFile("/users/guest/read.txt")
// Split each line into words
val words = inputlines.flatMap(line => line.split(" "))
// Pair each word with a count of 1
val wMap = words.map(word => (word, 1))
// Sum the counts for each word
val wOutput = wMap.reduceByKey(_ + _)
// Save to a fresh directory (saveAsTextFile fails if the directory already exists)
wOutput.saveAsTextFile("/users/guest/wordcount_output")[/code]
Reversal of a String in Scala using a recursive function – [code lang="scala"]object reverseString extends App {
  val s = "24Tutorials"
  print(revs(s))

  // Recursively reverse: reverse the tail, then append the head
  def revs(s: String): String = {
    if (s.isEmpty) ""
    else revs(s.tail) + s.head
    // equivalent: revs(s.substring(1)) + s.charAt(0)
  }
}[/code] Output: slairotuT42
Q1) Case classes: A case class is a class that may be used with the match/case statement. Case classes can be pattern matched. Case classes automatically define hashCode and equals. Case classes automatically define getter methods for their constructor arguments. Case classes can be seen as plain, immutable data-holding objects that depend exclusively on their constructor arguments. Case classes come with a companion object that holds an apply method; this makes it possible to instantiate a case class without the new keyword. Q2) Pattern matching: Scala has a built-in general pattern matching mechanism. It allows matching on any sort of data with a first-match policy. object MatchTest1 extends App { def matchTest(x: Int): String = x match { case 1 => "one" case 2 =>...
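The two answers above can be combined into one short, runnable sketch. The `Person` case class and the `describe` function are illustrative names, not from the original post:

```scala
// A case class: immutable data holder with auto-generated equals/hashCode,
// getters for its constructor arguments, and a companion apply method.
case class Person(name: String, age: Int)

object CaseDemo extends App {
  // No `new` keyword needed: the companion object's apply method is called
  val p = Person("Alice", 30)

  // Pattern matching deconstructs a case class into its constructor arguments,
  // trying each case in order (first-match policy)
  def describe(x: Any): String = x match {
    case Person(name, age) if age >= 18 => s"$name is an adult"
    case Person(name, _)                => s"$name is a minor"
    case n: Int                         => s"an integer: $n"
    case _                              => "something else"
  }

  println(describe(p)) // Alice is an adult
  println(describe(7)) // an integer: 7
}
```

Because case classes define structural equality, `Person("Alice", 30) == Person("Alice", 30)` is true even though the two values are distinct objects.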