org.apache.spark.sql
Class SQLContext

Object
  org.apache.spark.sql.SQLContext

public class SQLContext
extends Object

The entry point for working with structured data (rows and columns) in Spark. Allows the creation of DataFrame objects as well as the execution of SQL queries.
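For orientation, a minimal Java sketch of typical usage, assuming an existing JavaSparkContext named jsc and a JSON file at a hypothetical path:

  import org.apache.spark.sql.DataFrame;
  import org.apache.spark.sql.SQLContext;

  // Wrap an existing JavaSparkContext (jsc is assumed to exist).
  SQLContext sqlContext = new SQLContext(jsc);

  // Load structured data through the DataFrameReader; the path is hypothetical.
  DataFrame people = sqlContext.read().json("/data/people.json");

  // Register the DataFrame as a temporary table and query it with SQL.
  people.registerTempTable("people");
  DataFrame adults = sqlContext.sql("SELECT name FROM people WHERE age >= 18");
  adults.show();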
Nested Class Summary

class SQLContext.implicits$
    :: Experimental :: (Scala-specific) Implicit methods available in Scala for converting common Scala objects into DataFrames.
Constructor Summary

SQLContext(JavaSparkContext sparkContext)
SQLContext(SparkContext sparkContext)
Method Summary

DataFrame  applySchema(JavaRDD<?> rdd, Class<?> beanClass)
DataFrame  applySchema(JavaRDD<Row> rowRDD, StructType schema)
DataFrame  applySchema(RDD<?> rdd, Class<?> beanClass)
DataFrame  applySchema(RDD<Row> rowRDD, StructType schema)
DataFrame  baseRelationToDataFrame(BaseRelation baseRelation)
void  cacheTable(String tableName)
    Caches the specified table in-memory.
void  clearCache()
    Removes all cached tables from the in-memory cache.
DataFrame  createDataFrame(JavaRDD<?> rdd, Class<?> beanClass)
DataFrame  createDataFrame(JavaRDD<Row> rowRDD, StructType schema)
DataFrame  createDataFrame(RDD<?> rdd, Class<?> beanClass)
<A extends scala.Product> DataFrame  createDataFrame(RDD<A> rdd, scala.reflect.api.TypeTags.TypeTag<A> evidence$3)
DataFrame  createDataFrame(RDD<Row> rowRDD, StructType schema)
<A extends scala.Product> DataFrame  createDataFrame(scala.collection.Seq<A> data, scala.reflect.api.TypeTags.TypeTag<A> evidence$4)
DataFrame  createExternalTable(String tableName, String path)
DataFrame  createExternalTable(String tableName, String source, java.util.Map<String,String> options)
DataFrame  createExternalTable(String tableName, String source, scala.collection.immutable.Map<String,String> options)
DataFrame  createExternalTable(String tableName, String path, String source)
DataFrame  createExternalTable(String tableName, String source, StructType schema, java.util.Map<String,String> options)
DataFrame  createExternalTable(String tableName, String source, StructType schema, scala.collection.immutable.Map<String,String> options)
void  dropTempTable(String tableName)
DataFrame  emptyDataFrame()
    :: Experimental :: Returns a DataFrame with no rows or columns.
ExperimentalMethods  experimental()
    :: Experimental :: A collection of methods that are considered experimental, but can be used to hook into the query planner for advanced functionality.
scala.collection.immutable.Map<String,String>  getAllConfs()
    Return all the configuration properties that have been set (i.e. not the default).
String  getConf(String key)
    Return the value of Spark SQL configuration property for the given key.
String  getConf(String key, String defaultValue)
    Return the value of Spark SQL configuration property for the given key.
static SQLContext  getOrCreate(SparkContext sparkContext)
    Get the singleton SQLContext if it exists or create a new one using the given SparkContext.
SQLContext.implicits$  implicits()
    Accessor for nested Scala object.
boolean  isCached(String tableName)
    Returns true if the table is currently cached in-memory.
DataFrame  jdbc(String url, String table)
    Deprecated. As of 1.4.0, replaced by read().jdbc().
DataFrame  jdbc(String url, String table, String[] theParts)
    Deprecated. As of 1.4.0, replaced by read().jdbc().
DataFrame  jdbc(String url, String table, String columnName, long lowerBound, long upperBound, int numPartitions)
    Deprecated. As of 1.4.0, replaced by read().jdbc().
DataFrame  jsonFile(String path)
    Deprecated. As of 1.4.0, replaced by read().json().
DataFrame  jsonFile(String path, double samplingRatio)
    Deprecated. As of 1.4.0, replaced by read().json().
DataFrame  jsonFile(String path, StructType schema)
    Deprecated. As of 1.4.0, replaced by read().json().
DataFrame  jsonRDD(JavaRDD<String> json)
    Deprecated. As of 1.4.0, replaced by read().json().
DataFrame  jsonRDD(JavaRDD<String> json, double samplingRatio)
    Deprecated. As of 1.4.0, replaced by read().json().
DataFrame  jsonRDD(JavaRDD<String> json, StructType schema)
    Deprecated. As of 1.4.0, replaced by read().json().
DataFrame  jsonRDD(RDD<String> json)
    Deprecated. As of 1.4.0, replaced by read().json().
DataFrame  jsonRDD(RDD<String> json, double samplingRatio)
    Deprecated. As of 1.4.0, replaced by read().json().
DataFrame  jsonRDD(RDD<String> json, StructType schema)
    Deprecated. As of 1.4.0, replaced by read().json().
DataFrame  load(String path)
    Deprecated. As of 1.4.0, replaced by read().load(path).
DataFrame  load(String source, java.util.Map<String,String> options)
    Deprecated. As of 1.4.0, replaced by read().format(source).options(options).load().
DataFrame  load(String source, scala.collection.immutable.Map<String,String> options)
    Deprecated. As of 1.4.0, replaced by read().format(source).options(options).load().
DataFrame  load(String path, String source)
    Deprecated. As of 1.4.0, replaced by read().format(source).load(path).
DataFrame  load(String source, StructType schema, java.util.Map<String,String> options)
    Deprecated. As of 1.4.0, replaced by read().format(source).schema(schema).options(options).load().
DataFrame  load(String source, StructType schema, scala.collection.immutable.Map<String,String> options)
    Deprecated. As of 1.4.0, replaced by read().format(source).schema(schema).options(options).load().
DataFrame  parquetFile(scala.collection.Seq<String> paths)
DataFrame  parquetFile(String... paths)
    Deprecated. As of 1.4.0, replaced by read().parquet().
DataFrame  range(long end)
DataFrame  range(long start, long end)
DataFrame  range(long start, long end, long step, int numPartitions)
DataFrameReader  read()
void  setConf(java.util.Properties props)
    Set Spark SQL configuration properties.
void  setConf(String key, String value)
    Set the given Spark SQL configuration property.
SparkContext  sparkContext()
DataFrame  sql(String sqlText)
DataFrame  table(String tableName)
String[]  tableNames()
String[]  tableNames(String databaseName)
DataFrame  tables()
DataFrame  tables(String databaseName)
UDFRegistration  udf()
    A collection of methods for registering user-defined functions (UDF).
void  uncacheTable(String tableName)
    Removes the specified table from the in-memory cache.
Methods inherited from class Object
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Methods inherited from interface org.apache.spark.Logging
initializeIfNecessary, initializeLogging, isTraceEnabled, log_, log, logDebug, logDebug, logError, logError, logInfo, logInfo, logName, logTrace, logTrace, logWarning, logWarning
Constructor Detail

public SQLContext(SparkContext sparkContext)

public SQLContext(JavaSparkContext sparkContext)
Method Detail
public static SQLContext getOrCreate(SparkContext sparkContext)
Get the singleton SQLContext if it exists or create a new one using the given SparkContext.
Parameters:
sparkContext - (undocumented)
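A brief sketch of how this is typically used, assuming an existing SparkContext named sc:

  import org.apache.spark.sql.SQLContext;

  // Returns the singleton SQLContext if it exists, otherwise creates one from sc (sc is assumed to exist).
  SQLContext sqlContext = SQLContext.getOrCreate(sc);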
public DataFrame parquetFile(String... paths)
Deprecated. As of 1.4.0, replaced by read().parquet().
Loads a Parquet file, returning the result as a DataFrame. This function returns an empty DataFrame if no paths are passed in.
Parameters:
paths - (undocumented)
public SparkContext sparkContext()
public void setConf(java.util.Properties props)
Set Spark SQL configuration properties.
Parameters:
props - (undocumented)

public void setConf(String key, String value)
Set the given Spark SQL configuration property.
Parameters:
key - (undocumented)
value - (undocumented)

public String getConf(String key)
Return the value of Spark SQL configuration property for the given key.
Parameters:
key - (undocumented)

public String getConf(String key, String defaultValue)
Return the value of Spark SQL configuration property for the given key. If the key is not set yet, return defaultValue.
Parameters:
key - (undocumented)
defaultValue - (undocumented)
public scala.collection.immutable.Map<String,String> getAllConfs()
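A small sketch of the configuration accessors; the keys shown are real Spark SQL properties, but the values are only illustrative:

  // Set a property, then read it back with and without a fallback default.
  sqlContext.setConf("spark.sql.shuffle.partitions", "50");
  String partitions = sqlContext.getConf("spark.sql.shuffle.partitions"); // "50"
  String codegen = sqlContext.getConf("spark.sql.codegen", "false");      // default used if the key is unset

  // Everything set so far, as an immutable Scala map.
  scala.collection.immutable.Map<String,String> confs = sqlContext.getAllConfs();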
public ExperimentalMethods experimental()
public DataFrame emptyDataFrame()
:: Experimental :: Returns a DataFrame with no rows or columns.
public UDFRegistration udf()
A collection of methods for registering user-defined functions (UDF).

The following example registers a Scala closure as UDF:

  sqlContext.udf.register("myUdf", (arg1: Int, arg2: String) => arg2 + arg1)

The following example registers a UDF in Java:

  sqlContext.udf().register("myUDF",
      new UDF2<Integer, String, String>() {
          @Override
          public String call(Integer arg1, String arg2) {
              return arg2 + arg1;
          }
      }, DataTypes.StringType);

Or, to use Java 8 lambda syntax:

  sqlContext.udf().register("myUDF",
      (Integer arg1, String arg2) -> arg2 + arg1,
      DataTypes.StringType);
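Once registered, the UDF can be invoked by name from SQL. A sketch, assuming a temporary table named people with columns age and name (both hypothetical):

  // Calls the myUDF function registered above.
  DataFrame result = sqlContext.sql("SELECT myUDF(age, name) FROM people");
  result.show();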
public boolean isCached(String tableName)
Returns true if the table is currently cached in-memory.
Parameters:
tableName - (undocumented)

public void cacheTable(String tableName)
Caches the specified table in-memory.
Parameters:
tableName - (undocumented)

public void uncacheTable(String tableName)
Removes the specified table from the in-memory cache.
Parameters:
tableName - (undocumented)

public void clearCache()
Removes all cached tables from the in-memory cache.
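A short sketch of the caching workflow, assuming a table named logs has already been registered (hypothetical):

  // Pin the table's data in memory, check the flag, then release it.
  sqlContext.cacheTable("logs");
  boolean cached = sqlContext.isCached("logs"); // true after cacheTable
  sqlContext.uncacheTable("logs");
  sqlContext.clearCache();                      // removes anything still cached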
public SQLContext.implicits$ implicits()
public <A extends scala.Product> DataFrame createDataFrame(RDD<A> rdd, scala.reflect.api.TypeTags.TypeTag<A> evidence$3)
public <A extends scala.Product> DataFrame createDataFrame(scala.collection.Seq<A> data, scala.reflect.api.TypeTags.TypeTag<A> evidence$4)
public DataFrame baseRelationToDataFrame(BaseRelation baseRelation)
public DataFrame createDataFrame(RDD<Row> rowRDD, StructType schema)
public DataFrame createDataFrame(JavaRDD<Row> rowRDD, StructType schema)
public DataFrame createDataFrame(RDD<?> rdd, Class<?> beanClass)
public DataFrame createDataFrame(JavaRDD<?> rdd, Class<?> beanClass)
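A sketch of createDataFrame(JavaRDD<Row>, StructType), building a DataFrame from hand-made rows plus an explicit schema; jsc and sqlContext are assumed to exist, and the column names and values are made up:

  import java.util.Arrays;
  import org.apache.spark.api.java.JavaRDD;
  import org.apache.spark.sql.DataFrame;
  import org.apache.spark.sql.Row;
  import org.apache.spark.sql.RowFactory;
  import org.apache.spark.sql.types.DataTypes;
  import org.apache.spark.sql.types.StructType;

  // A few rows built by hand.
  JavaRDD<Row> rows = jsc.parallelize(Arrays.asList(
      RowFactory.create("Alice", 29),
      RowFactory.create("Bob", 31)));

  // Describe the columns explicitly.
  StructType schema = DataTypes.createStructType(Arrays.asList(
      DataTypes.createStructField("name", DataTypes.StringType, false),
      DataTypes.createStructField("age", DataTypes.IntegerType, false)));

  DataFrame people = sqlContext.createDataFrame(rows, schema);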
public DataFrameReader read()
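read() returns the DataFrameReader that supersedes the deprecated load/json/parquet/jdbc helpers below. A sketch with a hypothetical path:

  // Generic form: pick a format, then load.
  DataFrame users = sqlContext.read()
      .format("parquet")
      .load("/data/users.parquet"); // hypothetical location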
public DataFrame createExternalTable(String tableName, String path)
public DataFrame createExternalTable(String tableName, String path, String source)
public DataFrame createExternalTable(String tableName, String source, java.util.Map<String,String> options)
public DataFrame createExternalTable(String tableName, String source, scala.collection.immutable.Map<String,String> options)
public DataFrame createExternalTable(String tableName, String source, StructType schema, java.util.Map<String,String> options)
public DataFrame createExternalTable(String tableName, String source, StructType schema, scala.collection.immutable.Map<String,String> options)
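A sketch of registering existing data as an external table in the catalog; the table name, source, and path option are hypothetical:

  import java.util.HashMap;
  import java.util.Map;

  // Point a catalog entry at existing Parquet data without copying it.
  Map<String,String> options = new HashMap<String,String>();
  options.put("path", "/data/events"); // hypothetical location
  DataFrame events = sqlContext.createExternalTable("events", "parquet", options);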
public void dropTempTable(String tableName)
public DataFrame range(long start, long end)
public DataFrame range(long end)
public DataFrame range(long start, long end, long step, int numPartitions)
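range() is convenient for generating numeric test data as a single-column DataFrame of longs. A quick sketch:

  // 0, 2, 4, ..., 98 spread over 4 partitions.
  DataFrame numbers = sqlContext.range(0, 100, 2, 4);
  long count = numbers.count(); // 50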
public DataFrame sql(String sqlText)
public DataFrame table(String tableName)
public DataFrame tables()
public DataFrame tables(String databaseName)
public String[] tableNames()
public String[] tableNames(String databaseName)
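A sketch tying the catalog accessors together, assuming a DataFrame has already been registered as a temporary table named people (hypothetical):

  DataFrame teenagers = sqlContext.sql("SELECT name FROM people WHERE age BETWEEN 13 AND 19");
  DataFrame sameTable = sqlContext.table("people");
  String[] names = sqlContext.tableNames(); // includes "people"
  DataFrame catalog = sqlContext.tables();  // one row per table in the current database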
public DataFrame applySchema(RDD<Row> rowRDD, StructType schema)
public DataFrame applySchema(JavaRDD<Row> rowRDD, StructType schema)
public DataFrame applySchema(RDD<?> rdd, Class<?> beanClass)
public DataFrame applySchema(JavaRDD<?> rdd, Class<?> beanClass)
public DataFrame parquetFile(scala.collection.Seq<String> paths)
public DataFrame jsonFile(String path)
Deprecated. As of 1.4.0, replaced by read().json().
Loads a JSON file (one object per line), returning the result as a DataFrame. It goes through the entire dataset once to determine the schema.
Parameters:
path - (undocumented)

public DataFrame jsonFile(String path, StructType schema)
Deprecated. As of 1.4.0, replaced by read().json().
Loads a JSON file (one object per line) and applies the given schema, returning the result as a DataFrame.
Parameters:
path - (undocumented)
schema - (undocumented)

public DataFrame jsonFile(String path, double samplingRatio)
Deprecated. As of 1.4.0, replaced by read().json().
Parameters:
path - (undocumented)
samplingRatio - (undocumented)
public DataFrame jsonRDD(RDD<String> json)
Deprecated. As of 1.4.0, replaced by read().json().
Loads an RDD[String] storing JSON objects (one object per record), returning the result as a DataFrame. It goes through the entire dataset once to determine the schema.
Parameters:
json - (undocumented)

public DataFrame jsonRDD(JavaRDD<String> json)
Deprecated. As of 1.4.0, replaced by read().json().
Loads a JavaRDD[String] storing JSON objects (one object per record), returning the result as a DataFrame. It goes through the entire dataset once to determine the schema.
Parameters:
json - (undocumented)

public DataFrame jsonRDD(RDD<String> json, StructType schema)
Deprecated. As of 1.4.0, replaced by read().json().
Loads an RDD[String] storing JSON objects (one object per record) and applies the given schema, returning the result as a DataFrame.
Parameters:
json - (undocumented)
schema - (undocumented)

public DataFrame jsonRDD(JavaRDD<String> json, StructType schema)
Deprecated. As of 1.4.0, replaced by read().json().
Loads a JavaRDD[String] storing JSON objects (one object per record) and applies the given schema, returning the result as a DataFrame.
Parameters:
json - (undocumented)
schema - (undocumented)

public DataFrame jsonRDD(RDD<String> json, double samplingRatio)
Deprecated. As of 1.4.0, replaced by read().json().
Loads an RDD[String] storing JSON objects (one object per record), returning the result as a DataFrame.
Parameters:
json - (undocumented)
samplingRatio - (undocumented)

public DataFrame jsonRDD(JavaRDD<String> json, double samplingRatio)
Deprecated. As of 1.4.0, replaced by read().json().
Loads a JavaRDD[String] storing JSON objects (one object per record), returning the result as a DataFrame.
Parameters:
json - (undocumented)
samplingRatio - (undocumented)
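The non-deprecated equivalents of jsonFile and jsonRDD go through the DataFrameReader. A sketch, with a hypothetical path and an assumed JavaRDD<String> of JSON documents named jsonLines:

  // From a file path (hypothetical).
  DataFrame fromFile = sqlContext.read().json("/data/people.json");

  // From an in-memory RDD of JSON strings (jsonLines is assumed to exist).
  DataFrame fromRdd = sqlContext.read().json(jsonLines);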
public DataFrame load(String path)
Deprecated. As of 1.4.0, replaced by read().load(path).
Parameters:
path - (undocumented)

public DataFrame load(String path, String source)
Deprecated. As of 1.4.0, replaced by read().format(source).load(path).
Parameters:
path - (undocumented)
source - (undocumented)

public DataFrame load(String source, java.util.Map<String,String> options)
Deprecated. As of 1.4.0, replaced by read().format(source).options(options).load().
Parameters:
source - (undocumented)
options - (undocumented)

public DataFrame load(String source, scala.collection.immutable.Map<String,String> options)
Deprecated. As of 1.4.0, replaced by read().format(source).options(options).load().
Parameters:
source - (undocumented)
options - (undocumented)

public DataFrame load(String source, StructType schema, java.util.Map<String,String> options)
Deprecated. As of 1.4.0, replaced by read().format(source).schema(schema).options(options).load().
Parameters:
source - (undocumented)
schema - (undocumented)
options - (undocumented)

public DataFrame load(String source, StructType schema, scala.collection.immutable.Map<String,String> options)
Deprecated. As of 1.4.0, replaced by read().format(source).schema(schema).options(options).load().
Parameters:
source - (undocumented)
schema - (undocumented)
options - (undocumented)
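The corresponding DataFrameReader form of the deprecated load(source, options) overloads; the source name and path option are only illustrative:

  import java.util.HashMap;
  import java.util.Map;

  Map<String,String> options = new HashMap<String,String>();
  options.put("path", "/data/events.json"); // hypothetical option understood by the json source

  DataFrame df = sqlContext.read()
      .format("json")
      .options(options)
      .load();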
public DataFrame jdbc(String url, String table)
Deprecated. As of 1.4.0, replaced by read().jdbc().
Construct a DataFrame representing the database table accessible via JDBC URL url named table.
Parameters:
url - (undocumented)
table - (undocumented)

public DataFrame jdbc(String url, String table, String columnName, long lowerBound, long upperBound, int numPartitions)
Deprecated. As of 1.4.0, replaced by read().jdbc().
Construct a DataFrame representing the database table accessible via JDBC URL url named table. Partitions of the table will be retrieved in parallel based on the parameters passed to this function.
Parameters:
columnName - the name of a column of integral type that will be used for partitioning.
lowerBound - the minimum value of columnName used to decide partition stride
upperBound - the maximum value of columnName used to decide partition stride
numPartitions - the number of partitions. The range minValue-maxValue will be split evenly into this many partitions.
url - (undocumented)
table - (undocumented)

public DataFrame jdbc(String url, String table, String[] theParts)
Deprecated. As of 1.4.0, replaced by read().jdbc().
Construct a DataFrame representing the database table accessible via JDBC URL url named table. The theParts parameter gives a list of expressions suitable for inclusion in WHERE clauses; each one defines one partition of the DataFrame.
Parameters:
url - (undocumented)
table - (undocumented)
theParts - (undocumented)
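The non-deprecated path is DataFrameReader.jdbc. A sketch with a hypothetical connection URL, table, and credentials:

  import java.util.Properties;

  Properties connectionProperties = new Properties();
  connectionProperties.setProperty("user", "reporting"); // hypothetical credentials
  connectionProperties.setProperty("password", "secret");

  // Equivalent of the deprecated jdbc(url, table).
  DataFrame orders = sqlContext.read().jdbc(
      "jdbc:postgresql://db.example.com:5432/sales",     // hypothetical URL
      "orders",
      connectionProperties);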