
All Questions

0 votes
0 answers
79 views

Spark JDBC Connection To MsSQL Using Kerberos - Failed to find any Kerberos tgt

While trying to connect Spark with MSSQL, we are setting up a JDBC connection and want to Kerberize it. Using the keytab and principal we created, we were able to establish a connection with a simple ...
Baki Erbaş
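A minimal sketch of the setup this question describes, assuming the Microsoft mssql-jdbc driver; the host, database, principal, and keytab path are hypothetical placeholders:

```python
# Sketch: JDBC URL shape typically used for Kerberos authentication with the
# mssql-jdbc driver (integratedSecurity + authenticationScheme=JavaKerberos).
def mssql_kerberos_url(host, port, database):
    return (
        f"jdbc:sqlserver://{host}:{port};databaseName={database};"
        "integratedSecurity=true;authenticationScheme=JavaKerberos"
    )

url = mssql_kerberos_url("sqlhost.example.com", 1433, "sales")
# With Spark 3.1+, the keytab/principal pair can be passed as reader options:
# spark.read.format("jdbc").option("url", url)
#      .option("principal", "svc_spark@EXAMPLE.COM")
#      .option("keytab", "/etc/security/keytabs/svc_spark.keytab") ...
```

"Failed to find any Kerberos tgt" usually means the JVM never obtained a ticket, so checking `kinit -kt` with the same keytab/principal outside Spark is a useful first step.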
1 vote
1 answer
66 views

Disable inferSchema for JDBC connections

I have an Azure SQL database that I want to query with PySpark. I have to "copy" the data to a temporary table, and then query this temporary table. I would like to use pretty much the same ...
ralpar
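The JDBC reader derives types from database metadata rather than sampling data, and those types can be overridden with the `customSchema` option (a DDL-style string). A small sketch, with hypothetical column names:

```python
# Sketch: render a dict of column -> Spark SQL type as the DDL string the
# JDBC reader's `customSchema` option expects.
def custom_schema(columns):
    """{"id": "STRING", ...} -> "id STRING, ..." """
    return ", ".join(f"{name} {dtype}" for name, dtype in columns.items())

ddl = custom_schema({"id": "STRING", "amount": "DECIMAL(18,2)"})
# spark.read.format("jdbc").option("customSchema", ddl) ...
```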
1 vote
1 answer
87 views

Painfully slow Spark Oracle read (JDBC)

I am reading a small table from an Oracle database using Spark on Databricks. The code is very simple: df = spark.read.jdbc(url=url, table=table_name, properties={"driver": "...
Łukasz Kastelik
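A single-partition JDBC read runs one query on one executor; the documented fix is a partitioned read via `partitionColumn`/`lowerBound`/`upperBound`/`numPartitions`. A sketch of the WHERE clauses Spark generates for those options (note the first and last partitions are unbounded, so all rows are still read; the column and bounds are hypothetical):

```python
# Sketch of Spark's documented stride logic for partitioned JDBC reads.
def partition_predicates(column, lower, upper, num_partitions):
    stride = (upper - lower) // num_partitions
    preds = []
    bound = lower + stride
    preds.append(f"{column} < {bound} OR {column} IS NULL")  # open lower end
    for _ in range(num_partitions - 2):
        preds.append(f"{column} >= {bound} AND {column} < {bound + stride}")
        bound += stride
    preds.append(f"{column} >= {bound}")                      # open upper end
    return preds

preds = partition_predicates("ID", 0, 100, 4)
```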
1 vote
1 answer
110 views

Spark-PySpark Redshift JDBC Write: No suitable driver / ClassNotFoundException: com.amazon.redshift.jdbc42.Driver Errors

I’m trying to write a DataFrame from Spark (PySpark) to an Amazon Redshift Serverless cluster using the Redshift JDBC driver. I keep running into driver-related errors: • java.sql.SQLException: No ...
Cauder
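`ClassNotFoundException` on a JDBC driver class almost always means the driver jar was not shipped with the job. A sketch of one common fix, via `spark.jars.packages`; the Maven coordinate below is an assumption, so check the Redshift driver version you actually need:

```python
# Sketch: join Maven coordinates into the comma-separated value that
# spark.jars.packages expects.
def packages_conf(coords):
    return ",".join(coords)

conf_value = packages_conf(["com.amazon.redshift:redshift-jdbc42:2.1.0.30"])
# SparkSession.builder.config("spark.jars.packages", conf_value) ... and then
# .option("driver", "com.amazon.redshift.jdbc42.Driver") on the writer, matching
# the class name inside the jar version in use.
```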
0 votes
1 answer
58 views

Writing data to ADW through JDBC in a PySpark environment performs poorly

I am trying to write PySpark DataFrames to ADW (Oracle Autonomous Data Warehouse) using JDBC in a Jupyter Lab environment, but the performance is low. dataframe.write.format("jdbc").mode('...
danmo41
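Slow JDBC writes are usually dominated by small INSERT batches from a single partition; raising `batchsize` (rows per JDBC batch) and writing from several partitions tends to help. A hypothetical options sketch:

```python
# Sketch: writer options for a JDBC write; Spark reads options as strings.
def write_options(url, table, batchsize=10000):
    return {
        "url": url,
        "dbtable": table,
        "batchsize": str(batchsize),
    }

opts = write_options("jdbc:oracle:thin:@//adw.example.com:1522/svc", "TARGET_TBL")
# df.repartition(8).write.format("jdbc").options(**opts).mode("append").save()
```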
0 votes
2 answers
128 views

Breaking up a large JDBC write with Spark

We want to copy a large Spark dataframe into Oracle, but I am finding the tuning options a bit limited. Looking at Spark documentation, the only related tuning property I could find for a JDBC write ...
Depressio
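Beyond `batchsize`, the main knob for a JDBC write is the number of partitions doing the writing, so "chunking" usually means repartitioning so each task writes a bounded number of rows. A hypothetical sizing helper:

```python
# Sketch: pick a partition count so each task writes at most
# `rows_per_partition` rows.
def partitions_for(total_rows, rows_per_partition):
    return max(1, -(-total_rows // rows_per_partition))  # ceiling division

n = partitions_for(10_500_000, 1_000_000)
# df.repartition(n).write.format("jdbc")...
```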
1 vote
1 answer
102 views

Spark JDBC read results in row loss

We're currently running into an issue with a Spark job architecture which is used as an interface to import data sources from a corporate Oracle data warehouse into Amazon S3. The job contains no ...
ddluke
1 vote
2 answers
139 views

Unable to get the Postgres data in the right format via Kafka, JDBC source connector and PySpark

I have created a table in Postgres: CREATE TABLE IF NOT EXISTS public.sample_a ( id text COLLATE pg_catalog."default" NOT NULL, is_active boolean NOT NULL, is_deleted boolean NOT ...
RushHour
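The classic symptom here is Postgres NUMERIC columns arriving as opaque bytes: the Kafka Connect JDBC source encodes them as Connect Decimals, i.e. big-endian two's-complement bytes (base64 in JSON) plus a scale. A sketch of decoding one back, e.g. inside a PySpark UDF; the scale is whatever the connector schema carries (setting `numeric.mapping=best_fit` on the connector avoids the problem at the source):

```python
# Sketch: decode a Kafka Connect Decimal (base64 big-endian bytes + scale).
import base64

def decode_connect_decimal(b64_value, scale):
    raw = base64.b64decode(b64_value)
    unscaled = int.from_bytes(raw, byteorder="big", signed=True)
    return unscaled / (10 ** scale)

# b"\x04\xd2" is the unscaled integer 1234; with scale 2 that's 12.34
value = decode_connect_decimal(base64.b64encode(b"\x04\xd2").decode(), 2)
```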
0 votes
2 answers
90 views

Reading from Apache Ignite with JDBC driver gives SQLException: Fetch size must be greater than zero

I'm trying to read some data from an Apache Ignite table with PySpark. spark.read.format("jdbc").option("driver", "org.apache.ignite.IgniteJdbcThinDriver")\ .option("...
Felix
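The Ignite thin JDBC driver rejects a non-positive fetch size, and Spark only sets one when the `fetchsize` option is given explicitly, so the usual workaround is to pass a positive value. A hypothetical options sketch:

```python
# Sketch: reader options for the Ignite thin JDBC driver, with the guard the
# driver itself enforces.
def ignite_read_options(url, table, fetchsize=1000):
    if fetchsize <= 0:
        raise ValueError("Fetch size must be greater than zero")
    return {
        "driver": "org.apache.ignite.IgniteJdbcThinDriver",
        "url": url,
        "dbtable": table,
        "fetchsize": str(fetchsize),
    }

opts = ignite_read_options("jdbc:ignite:thin://127.0.0.1", "PERSON")
# spark.read.format("jdbc").options(**opts).load()
```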
0 votes
0 answers
40 views

How to save a PySpark dataframe to Apache Ignite with JDBC driver?

I have a PySpark application and I want to save some dataframe to Apache Ignite database (to a new table). I have created a Spark session with Spark JDBC driver plugged and I'm trying to save a ...
Felix
0 votes
0 answers
89 views

Py4JJavaError : The TCP/IP connection to the host localhost, port 1433 has failed

First time learning PySpark; I want to read from and write to SQL Server via JDBC using this code: from pyspark.sql import SparkSession host = "localhost" database = "...
abbym
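A port-1433 TCP failure usually means the URL is malformed, SQL Server's TCP/IP listener is disabled, or (with recent mssql-jdbc drivers, which default to `encrypt=true`) the TLS handshake fails. A sketch of a well-formed URL, with placeholder host/database:

```python
# Sketch: build a mssql-jdbc connection URL; encrypt=false is a common
# local-development setting, not a production recommendation.
def sqlserver_url(host, database, port=1433, encrypt=False):
    url = f"jdbc:sqlserver://{host}:{port};databaseName={database}"
    if not encrypt:
        url += ";encrypt=false"
    return url

url = sqlserver_url("localhost", "testdb")
```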
-1 votes
1 answer
67 views

Running a normal Java non-Spark application in a Spark cluster

I want to run/execute a normal Java application which connects to a Teradata database. I would like to run this Java app in a Spark cluster although my Java app is non-Spark. Questions are as follows: Is ...
ironfreak
0 votes
1 answer
128 views

Spark Driver runs out of memory loading a query from a JDBC driver

Here is the issue. I need to load data from a remote database, with a bit of an awkward query: query = f""" (SELECT id, key, MAX(time) AS created_at FROM remote.table WHERE ...
Lennart Reus
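Spark treats a parenthesised query passed as `dbtable` as a derived table, so the aggregation runs on the database instead of pulling raw rows through the connection. A wrapping sketch (the query text paraphrases the hypothetical one in the question; `subq` is an arbitrary alias):

```python
# Sketch: wrap a SQL query so it can be used as the JDBC `dbtable` option.
def as_dbtable(query, alias="subq"):
    return f"({query.strip()}) {alias}"

dbtable = as_dbtable(
    "SELECT id, key, MAX(time) AS created_at FROM remote.table GROUP BY id, key"
)
# spark.read.format("jdbc").option("dbtable", dbtable)...
```

A positive `fetchsize` (and, for MySQL, streaming result sets) also keeps the JVM from buffering the entire result at once.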
0 votes
0 answers
106 views

EMR Serverless SparkSession builder error: ClassNotFoundException issues

I am trying to create a job in EMR Studio to run in an EMR Serverless application. It's a relatively basic script to use PySpark to read some Athena tables, do some joins, create an output dataframe ...
si1287
0 votes
3 answers
216 views

How to connect Snowflake with PySpark with Google Colab?

I am trying to connect to Snowflake with Pyspark on Google Colab. Spark version 3.4 Scala version 2.12.17 from pyspark.sql import SparkSession from pyspark.sql.functions import * from pyspark import ...
Ryan Seiyu
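The Spark-Snowflake connector takes its connection settings as an `sfOptions` dict passed to format "snowflake". A sketch with placeholder account/user values; the connector jar and its bundled Snowflake JDBC driver must match the Spark/Scala versions in the question (3.4 / 2.12):

```python
# Sketch: standard sfOptions keys for the Spark-Snowflake connector.
def sf_options(account, user, password, database, schema, warehouse):
    return {
        "sfURL": f"{account}.snowflakecomputing.com",
        "sfUser": user,
        "sfPassword": password,
        "sfDatabase": database,
        "sfSchema": schema,
        "sfWarehouse": warehouse,
    }

opts = sf_options("xy12345", "me", "secret", "DEMO_DB", "PUBLIC", "COMPUTE_WH")
# spark.read.format("snowflake").options(**opts).option("dbtable", "T").load()
```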
