All Questions
Tagged with google-cloud-dataflow google-cloud-bigtable
90 questions
0 votes · 1 answer · 778 views
How to stream data from Pub/Sub to Google Bigtable using Dataflow?
Can someone tell me, or even show me an example, of a Dataflow job template, preferably in Python, in which I can:
Continuously read JSON data from a Pub/Sub topic
Process this data /...
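The general shape of such a pipeline, sketched here in Java with Beam's BigtableIO since that is the connector most questions on this page use (the question itself asks for Python, where WriteToBigTable from apache_beam.io.gcp.bigtableio plays the same role): read strings from Pub/Sub, parse the JSON in a DoFn, emit KV<ByteString, Iterable<Mutation>>, and write. All project, topic, family, and table ids below are placeholders.

```java
import com.google.bigtable.v2.Mutation;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import com.google.protobuf.ByteString;
import java.util.Collections;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigtable.BigtableIO;
import org.apache.beam.sdk.io.gcp.pubsub.PubsubIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.options.StreamingOptions;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.KV;

public class PubSubToBigtable {
  public static void main(String[] args) {
    StreamingOptions options =
        PipelineOptionsFactory.fromArgs(args).withValidation().as(StreamingOptions.class);
    options.setStreaming(true); // unbounded source, so run in streaming mode
    Pipeline p = Pipeline.create(options);

    p.apply("ReadJson", PubsubIO.readStrings()
            .fromTopic("projects/my-project/topics/my-topic")) // placeholder topic
     .apply("ToMutations", ParDo.of(new DoFn<String, KV<ByteString, Iterable<Mutation>>>() {
        @ProcessElement
        public void process(@Element String json,
                            OutputReceiver<KV<ByteString, Iterable<Mutation>>> out) {
          JsonObject obj = JsonParser.parseString(json).getAsJsonObject();
          Mutation setCell = Mutation.newBuilder()
              .setSetCell(Mutation.SetCell.newBuilder()
                  .setFamilyName("cf") // placeholder column family
                  .setColumnQualifier(ByteString.copyFromUtf8("payload"))
                  .setTimestampMicros(System.currentTimeMillis() * 1000)
                  .setValue(ByteString.copyFromUtf8(obj.toString())))
              .build();
          // "id" is a placeholder for whatever field should become the row key.
          out.output(KV.of(ByteString.copyFromUtf8(obj.get("id").getAsString()),
                           Collections.singletonList(setCell)));
        }
      }))
     .apply("WriteBigtable", BigtableIO.write()
         .withProjectId("my-project")
         .withInstanceId("my-instance")
         .withTableId("my-table"));

    p.run();
  }
}
```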
1 vote · 3 answers · 397 views
Google Cloud Dataflow BigQuery to Bigtable Transfer - Throttle Write Speed?
I have a number of Dataflow templates that copy data from BigQuery to Bigtable tables.
The largest is about 9 million rows (22 GB of data).
There are no complex mutations; it's just a copy.
...
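One knob worth knowing about, a minimal sketch assuming a Beam release recent enough (roughly 2.37+) to expose BigtableIO.Write#withFlowControl: it lets the Bigtable service push back on batch writes when the cluster is saturated. Ids are placeholders.

```java
import com.google.bigtable.v2.Mutation;
import com.google.protobuf.ByteString;
import org.apache.beam.sdk.io.gcp.bigtable.BigtableIO;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;

public class ThrottledBigtableWrite {
  // Attaches a flow-controlled Bigtable sink to an existing mutation stream.
  static void write(PCollection<KV<ByteString, Iterable<Mutation>>> mutations) {
    mutations.apply("WriteThrottled",
        BigtableIO.write()
            .withProjectId("my-project")   // placeholder ids
            .withInstanceId("my-instance")
            .withTableId("my-table")
            .withFlowControl(true));       // let Bigtable push back when saturated
  }
}
```

Separately, running the template with a capped `--maxNumWorkers` (say, 5) bounds how many workers write concurrently, which is often the simplest effective throttle.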
0 votes · 0 answers · 88 views
Type mismatch when trying to write to Bigtable using BigtableIO.Write()
I am attempting to write to Bigtable with BigtableIO v2.43.0.
My source data is a PCollection[BigtableRow] where BigtableRow is an internally defined construct.
Following the documentation for ...
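For context: BigtableIO.write() consumes PCollection<KV<ByteString, Iterable<Mutation>>> (com.google.bigtable.v2.Mutation protos), so a custom row type has to be mapped into that shape first; a mismatch here is the usual cause of this compile error. A sketch, with BigtableRow and its getters as stand-ins for the internally defined construct:

```java
import com.google.bigtable.v2.Mutation;
import com.google.protobuf.ByteString;
import java.util.Collections;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.KV;

// Stand-in for the question's internally defined row type; in a real pipeline
// it also needs a registered coder (e.g. SerializableCoder).
interface BigtableRow extends java.io.Serializable {
  String getKey();
  String getFamily();
  String getQualifier();
  String getValue();
}

// BigtableIO.write() consumes KV<ByteString, Iterable<Mutation>>, so the custom
// type has to be converted before the sink.
class ToBigtableMutation extends DoFn<BigtableRow, KV<ByteString, Iterable<Mutation>>> {
  @ProcessElement
  public void process(@Element BigtableRow row,
                      OutputReceiver<KV<ByteString, Iterable<Mutation>>> out) {
    Mutation setCell = Mutation.newBuilder()
        .setSetCell(Mutation.SetCell.newBuilder()
            .setFamilyName(row.getFamily())
            .setColumnQualifier(ByteString.copyFromUtf8(row.getQualifier()))
            .setTimestampMicros(System.currentTimeMillis() * 1000)
            .setValue(ByteString.copyFromUtf8(row.getValue())))
        .build();
    out.output(KV.of(ByteString.copyFromUtf8(row.getKey()),
                     Collections.singletonList(setCell)));
  }
}
```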
0 votes · 0 answers · 177 views
How to find failed inserts while writing to Bigtable using Dataflow jobs?
I am writing a pipeline to migrate data from GCS to Bigtable. The data is in JSON format. The pipeline works fine, but the number of records written by the Dataflow job and the count I get from BigQuery using Bigtable as ...
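BigtableIO does not surface per-record failures directly, so a common approach is a dead-letter pattern: build mutations in a multi-output DoFn and route anything that throws to a side output you can count or persist. A sketch; the JSON field names and family/qualifier are placeholders:

```java
import com.google.bigtable.v2.Mutation;
import com.google.gson.JsonObject;
import com.google.gson.JsonParser;
import com.google.protobuf.ByteString;
import java.util.Collections;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.*;

public class DeadLetterParse {
  static final TupleTag<KV<ByteString, Iterable<Mutation>>> OK =
      new TupleTag<KV<ByteString, Iterable<Mutation>>>() {};
  static final TupleTag<String> DEAD = new TupleTag<String>() {};

  static PCollectionTuple parse(PCollection<String> jsonLines) {
    return jsonLines.apply("ParseOrDeadLetter",
        ParDo.of(new DoFn<String, KV<ByteString, Iterable<Mutation>>>() {
          @ProcessElement
          public void process(@Element String line, MultiOutputReceiver out) {
            try {
              JsonObject obj = JsonParser.parseString(line).getAsJsonObject();
              Mutation m = Mutation.newBuilder()
                  .setSetCell(Mutation.SetCell.newBuilder()
                      .setFamilyName("cf") // placeholder family/qualifier
                      .setColumnQualifier(ByteString.copyFromUtf8("payload"))
                      .setTimestampMicros(System.currentTimeMillis() * 1000)
                      .setValue(ByteString.copyFromUtf8(obj.toString())))
                  .build();
              out.get(OK).output(KV.of(
                  ByteString.copyFromUtf8(obj.get("key").getAsString()), // placeholder key field
                  Collections.singletonList(m)));
            } catch (RuntimeException e) {
              out.get(DEAD).output(line); // count or persist these to locate the gap
            }
          }
        }).withOutputTags(OK, TupleTagList.of(DEAD)));
  }
}
```

Also note that Bigtable collapses writes sharing a row key into one row, so duplicate keys in the source are another frequent cause of a row-count mismatch.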
0 votes · 1 answer · 670 views
How do I locally test my Dataflow job with a side input that is refreshed periodically?
I have a Dataflow job in Java that reads input messages from a PubSub topic, takes in a side input that is refreshed every hour, combines information from the side input and PubSub Message, and is ...
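The documented Beam pattern for a slowly updating side input is a GenerateSequence that ticks once per refresh interval, re-fetches the reference data, and re-materializes a singleton view in the global window; for a local test, just pass a short interval. A sketch, with fetchReferenceData() as a hypothetical loader:

```java
import java.util.Collections;
import java.util.Map;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.GenerateSequence;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.transforms.View;
import org.apache.beam.sdk.transforms.windowing.*;
import org.apache.beam.sdk.values.PCollectionView;
import org.joda.time.Duration;

public class RefreshingSideInput {
  static PCollectionView<Map<String, String>> build(Pipeline p, Duration interval) {
    return p
        .apply(GenerateSequence.from(0).withRate(1, interval)) // one tick per refresh
        .apply(Window.<Long>into(new GlobalWindows())
            .triggering(Repeatedly.forever(AfterProcessingTime.pastFirstElementInPane()))
            .discardingFiredPanes())
        .apply(ParDo.of(new DoFn<Long, Map<String, String>>() {
          @ProcessElement
          public void process(OutputReceiver<Map<String, String>> out) {
            out.output(fetchReferenceData()); // re-fetched on every tick
          }
        }))
        .apply(View.asSingleton());
  }

  // Stub; replace with the real hourly lookup.
  static Map<String, String> fetchReferenceData() {
    return Collections.singletonMap("k", "v");
  }
}
```

Locally you might call build(p, Duration.standardSeconds(5)) under the DirectRunner and assert that downstream elements observe the updated map.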
0 votes · 0 answers · 282 views
How to manage concurrent put and delete to Bigtable?
I have an Apache Beam pipeline with two branches that do I/O against the same Bigtable table. One branch produces put mutations, the other delete mutations.
I cannot find ...
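If the deletes must observe the puts, one option is to turn the put branch into a signal with BigtableIO.Write#withWriteResults (available in recent Beam releases) and hold the delete branch behind it with Wait.on. A sketch; ids are placeholders:

```java
import com.google.bigtable.v2.Mutation;
import com.google.protobuf.ByteString;
import org.apache.beam.sdk.io.gcp.bigtable.BigtableIO;
import org.apache.beam.sdk.io.gcp.bigtable.BigtableWriteResult;
import org.apache.beam.sdk.transforms.Wait;
import org.apache.beam.sdk.values.KV;
import org.apache.beam.sdk.values.PCollection;

public class OrderedMutations {
  static void apply(PCollection<KV<ByteString, Iterable<Mutation>>> puts,
                    PCollection<KV<ByteString, Iterable<Mutation>>> deletes) {
    // withWriteResults() emits a signal PCollection once the puts are committed...
    PCollection<BigtableWriteResult> putsDone = puts.apply("WritePuts",
        BigtableIO.write()
            .withProjectId("my-project").withInstanceId("my-instance").withTableId("t")
            .withWriteResults());
    // ...so the delete branch can be held back until then.
    deletes
        .apply("WaitForPuts", Wait.on(putsDone))
        .apply("WriteDeletes",
            BigtableIO.write()
                .withProjectId("my-project").withInstanceId("my-instance").withTableId("t"));
  }
}
```

Wait.on holds elements per window, so in a streaming pipeline both branches need compatible windowing.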
1 vote · 1 answer · 236 views
How to delete a record from Bigtable
I have a data pipeline in Dataflow and am trying to delete a record from Bigtable using its row key.
I tried a couple of ways, e.g.:
However, I am unable to delete the record. Can I get some sample ...
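With BigtableIO, deleting is just another mutation: emit a DeleteFromRow (or DeleteFromFamily / DeleteFromColumn for narrower deletes) keyed by the row key. A minimal sketch:

```java
import com.google.bigtable.v2.Mutation;
import com.google.protobuf.ByteString;
import java.util.Collections;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.KV;

// Maps a row key to a delete-entire-row mutation for BigtableIO.write().
class KeyToDeleteMutation extends DoFn<String, KV<ByteString, Iterable<Mutation>>> {
  @ProcessElement
  public void process(@Element String rowKey,
                      OutputReceiver<KV<ByteString, Iterable<Mutation>>> out) {
    Mutation deleteRow = Mutation.newBuilder()
        .setDeleteFromRow(Mutation.DeleteFromRow.getDefaultInstance())
        .build();
    out.output(KV.of(ByteString.copyFromUtf8(rowKey),
                     Collections.singletonList(deleteRow)));
  }
}
```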
0 votes · 1 answer · 661 views
Getting an error while writing data to Cloud Bigtable through Dataflow
I am using a 2nd-gen Cloud Function to trigger a Dataflow job. The Dataflow template reads Parquet files from Cloud Storage and loads the data into Bigtable.
Here are the code and package details
...
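For reference, a minimal sketch of this shape with ParquetIO (beam-sdks-java-io-parquet) feeding BigtableIO; the schema, field names, and ids are placeholders. Errors in this setup are often version skew among the Beam modules and the Bigtable client, so keeping all Beam artifacts on one version is the first thing to check.

```java
import com.google.bigtable.v2.Mutation;
import com.google.protobuf.ByteString;
import java.util.Collections;
import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigtable.BigtableIO;
import org.apache.beam.sdk.io.parquet.ParquetIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.transforms.ParDo;
import org.apache.beam.sdk.values.KV;

public class ParquetToBigtable {
  public static void main(String[] args) {
    Pipeline p =
        Pipeline.create(PipelineOptionsFactory.fromArgs(args).withValidation().create());
    // Replace with the files' real Avro schema.
    Schema schema = new Schema.Parser().parse(
        "{\"type\":\"record\",\"name\":\"Rec\",\"fields\":[{\"name\":\"id\",\"type\":\"string\"}]}");

    p.apply("ReadParquet", ParquetIO.read(schema).from("gs://my-bucket/input/*.parquet"))
     .apply("ToMutations",
         ParDo.of(new DoFn<GenericRecord, KV<ByteString, Iterable<Mutation>>>() {
           @ProcessElement
           public void process(@Element GenericRecord rec,
                               OutputReceiver<KV<ByteString, Iterable<Mutation>>> out) {
             Mutation m = Mutation.newBuilder()
                 .setSetCell(Mutation.SetCell.newBuilder()
                     .setFamilyName("cf") // placeholder family
                     .setColumnQualifier(ByteString.copyFromUtf8("payload"))
                     .setTimestampMicros(System.currentTimeMillis() * 1000)
                     .setValue(ByteString.copyFromUtf8(rec.toString())))
                 .build();
             out.output(KV.of(ByteString.copyFromUtf8(String.valueOf(rec.get("id"))),
                              Collections.singletonList(m)));
           }
         }))
     .apply("WriteBigtable", BigtableIO.write()
         .withProjectId("my-project").withInstanceId("my-instance").withTableId("my-table"));

    p.run().waitUntilFinish();
  }
}
```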
1 vote · 1 answer · 182 views
What is the difference between using org.apache.hadoop.hbase.client vs com.google.cloud.bigtable.data.v2 on Dataflow (GCP)?
Is there a difference in performance, stability, or perhaps long-term support? I mean, is it necessary to migrate from the HBase API to the Bigtable connector for Apache Beam?
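A rough rule: org.apache.hadoop.hbase.client (via the bigtable-hbase shim) exists mainly to ease lift-and-shift from HBase, while com.google.cloud.bigtable.data.v2 is the Bigtable-native client; migration is usually not forced, but new code tends to target the native client. For comparison, a minimal write with the native client (ids are placeholders):

```java
import com.google.cloud.bigtable.data.v2.BigtableDataClient;
import com.google.cloud.bigtable.data.v2.models.RowMutation;

public class NativeClientWrite {
  public static void main(String[] args) throws Exception {
    // try-with-resources closes the underlying gRPC channel.
    try (BigtableDataClient client = BigtableDataClient.create("my-project", "my-instance")) {
      RowMutation mutation = RowMutation.create("my-table", "row-key#1")
          .setCell("cf", "qualifier", "value");
      client.mutateRow(mutation);
    }
  }
}
```

Within Beam pipelines the question mostly dissolves: BigtableIO handles the client underneath, and the HBase-flavored CloudBigtableIO is chiefly useful when porting existing HBase code.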
0 votes · 1 answer · 274 views
Migrate a BigTable table to a PostgreSQL table
I want to move out of BigTable because it is costly. I have one table of 5GB and want to migrate it to a PostgreSQL database. So far I have found ways to migrate data into BigTable, but not out of it, ...
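One way out is a Dataflow pipeline pairing BigtableIO.read() with JdbcIO.write() against PostgreSQL. A sketch, assuming the postgresql JDBC driver is on the classpath; the connection details, table, and column mapping are placeholders (real cells would be flattened out of the Row proto rather than dumped via toString):

```java
import com.google.bigtable.v2.Row;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigtable.BigtableIO;
import org.apache.beam.sdk.io.jdbc.JdbcIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class BigtableToPostgres {
  public static void main(String[] args) {
    Pipeline p =
        Pipeline.create(PipelineOptionsFactory.fromArgs(args).withValidation().create());
    p.apply("ReadBigtable", BigtableIO.read()
            .withProjectId("my-project").withInstanceId("my-instance").withTableId("my-table"))
     .apply("WritePostgres", JdbcIO.<Row>write()
        .withDataSourceConfiguration(JdbcIO.DataSourceConfiguration.create(
            "org.postgresql.Driver", "jdbc:postgresql://10.0.0.5:5432/mydb") // placeholder
            .withUsername("beam").withPassword("secret"))
        .withStatement("INSERT INTO my_table (row_key, payload) VALUES (?, ?)")
        .withPreparedStatementSetter((row, stmt) -> {
          stmt.setString(1, row.getKey().toStringUtf8());
          stmt.setString(2, row.toString()); // flatten individual cells in practice
        }));
    p.run().waitUntilFinish();
  }
}
```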
-1 votes · 1 answer · 151 views
How to set up staging and pre-prod for Google Dataflow jobs?
Say we have a Dataflow job that:
Is written with the Apache Beam Java SDK, as a Gradle project.
Uses a Pub/Sub stream as input, writes results to Bigtable, and writes logs to BigQuery.
As with deploying a server,...
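A common Beam answer: build one artifact and push every environment-specific value into PipelineOptions, so staging, pre-prod, and prod differ only in the flags your CI or Gradle run task passes. A sketch of such an options interface (names are illustrative):

```java
import org.apache.beam.sdk.options.Default;
import org.apache.beam.sdk.options.Description;
import org.apache.beam.sdk.options.PipelineOptions;

// One artifact, many environments: everything environment-specific is a flag.
public interface EnvOptions extends PipelineOptions {
  @Description("Pub/Sub subscription to read from")
  String getInputSubscription();
  void setInputSubscription(String value);

  @Description("Bigtable instance to write to")
  @Default.String("staging-instance") // placeholder default
  String getBigtableInstance();
  void setBigtableInstance(String value);
}
```

A staging deploy would then pass, for example, `--inputSubscription=projects/p-staging/subscriptions/s --bigtableInstance=staging-instance`, with prod supplying its own values from a separate CI job.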
1 vote · 1 answer · 66 views
Sink for user activity data stream to build Online ML model
I am writing a consumer that consumes user activity data (activityid, userid, timestamp, cta, duration) from Google Pub/Sub, and I want to create a sink for it so that I can train my ML model in ...
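If the sink is Bigtable, a row-key design along the lines of userid#(Long.MAX_VALUE - timestamp) keeps each user's most recent activity at the front of a prefix scan, which suits both training-data export and low-latency feature reads. A speculative sketch; the Activity class stands in for the decoded Pub/Sub payload:

```java
import com.google.bigtable.v2.Mutation;
import com.google.protobuf.ByteString;
import java.util.Collections;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.KV;

// Row key = userid#(Long.MAX_VALUE - timestamp), so a prefix scan on userid
// returns the most recent activity first.
class ActivityToBigtable extends DoFn<Activity, KV<ByteString, Iterable<Mutation>>> {
  @ProcessElement
  public void process(@Element Activity a,
                      OutputReceiver<KV<ByteString, Iterable<Mutation>>> out) {
    String rowKey = a.userId + "#" + (Long.MAX_VALUE - a.timestampMillis);
    Mutation m = Mutation.newBuilder()
        .setSetCell(Mutation.SetCell.newBuilder()
            .setFamilyName("events") // placeholder family
            .setColumnQualifier(ByteString.copyFromUtf8(a.cta))
            .setTimestampMicros(a.timestampMillis * 1000)
            .setValue(ByteString.copyFromUtf8(Long.toString(a.durationMillis))))
        .build();
    out.output(KV.of(ByteString.copyFromUtf8(rowKey), Collections.singletonList(m)));
  }
}

// Stand-in for the consumed payload; register a coder (e.g. SerializableCoder) in practice.
class Activity implements java.io.Serializable {
  String userId; String cta; long timestampMillis; long durationMillis;
}
```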
0 votes · 1 answer · 553 views
How to write to BigTable using the Apache Beam direct runner in Java?
I have been trying to get the Apache Beam direct runner to write to BigTable, but it seems like there is a problem.
There are no failures or confirmation errors in the terminal when I run gradle run.
My ...
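A frequent cause under gradle run is the JVM exiting before the pipeline drains, because p.run() returns immediately; blocking on waitUntilFinish() (and pinning the runner explicitly) usually surfaces either the writes or the real error. A sketch:

```java
import org.apache.beam.runners.direct.DirectRunner;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptions;
import org.apache.beam.sdk.options.PipelineOptionsFactory;

public class RunLocal {
  public static void main(String[] args) {
    PipelineOptions options = PipelineOptionsFactory.fromArgs(args).create();
    options.setRunner(DirectRunner.class); // explicit, so no other runner is picked up
    Pipeline p = Pipeline.create(options);
    // ... build the BigtableIO.write() pipeline here ...
    p.run().waitUntilFinish(); // without this, 'gradle run' can exit before writes flush
  }
}
```

Also check that GOOGLE_APPLICATION_CREDENTIALS points at valid credentials; the direct runner still talks to the real Bigtable service.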
1 vote · 1 answer · 314 views
Streaming Pub/Sub to Bigtable using Apache Beam/Dataflow in Java
Trying to write the Pub/Sub JSON message to Bigtable. I am running the code from a local machine. The Dataflow job is getting created, but I don't see any data updated in the Bigtable instance, and it also does ...
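Two things worth checking when a streaming write "succeeds" but nothing shows up: the target column family must already exist (BigtableIO will not create it; `cbt createfamily my-table cf` does), and each SetCell should carry an explicit positive timestamp, since the proto default of 0 dates the cell to the 1970 epoch. A sketch of the mutation-building DoFn; family, qualifier, and the UUID row key are placeholders:

```java
import com.google.bigtable.v2.Mutation;
import com.google.protobuf.ByteString;
import java.util.Collections;
import org.apache.beam.sdk.transforms.DoFn;
import org.apache.beam.sdk.values.KV;

// The family "cf" must already exist in the table; Bigtable rejects writes to
// unknown families, and those failures are easy to miss when running locally.
class JsonToMutation extends DoFn<String, KV<ByteString, Iterable<Mutation>>> {
  @ProcessElement
  public void process(@Element String json,
                      OutputReceiver<KV<ByteString, Iterable<Mutation>>> out) {
    Mutation m = Mutation.newBuilder()
        .setSetCell(Mutation.SetCell.newBuilder()
            .setFamilyName("cf")
            .setColumnQualifier(ByteString.copyFromUtf8("raw"))
            .setTimestampMicros(System.currentTimeMillis() * 1000) // explicit, not epoch 0
            .setValue(ByteString.copyFromUtf8(json)))
        .build();
    // Placeholder key; derive it from the message payload in practice.
    out.output(KV.of(ByteString.copyFromUtf8(java.util.UUID.randomUUID().toString()),
                     Collections.singletonList(m)));
  }
}
```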
1 vote · 1 answer · 1k views
How to read data from a table in another project in a different region?
We have a FACT1 table in project1, located in the US region,
and a FACT2 table in project2, located in the Asia region.
We want to union/join the two tables and persist the result into a table in project2.
But GCP ...
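BigQuery itself will not query across dataset locations, but a Dataflow pipeline can, since rows flow through the workers: read each table with BigQueryIO, Flatten, and write back to the Asia dataset. A sketch assuming identical schemas and placeholder ids; BigQueryIO reads also need `--tempLocation` set to a GCS bucket:

```java
import com.google.api.services.bigquery.model.TableRow;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.WriteDisposition;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Flatten;
import org.apache.beam.sdk.values.PCollection;
import org.apache.beam.sdk.values.PCollectionList;

public class CrossRegionUnion {
  public static void main(String[] args) {
    Pipeline p =
        Pipeline.create(PipelineOptionsFactory.fromArgs(args).withValidation().create());
    PCollection<TableRow> us = p.apply("ReadFact1",
        BigQueryIO.readTableRows().from("project1:us_dataset.FACT1"));
    PCollection<TableRow> asia = p.apply("ReadFact2",
        BigQueryIO.readTableRows().from("project2:asia_dataset.FACT2"));
    PCollectionList.of(us).and(asia)
        .apply("Union", Flatten.pCollections())
        .apply("WriteResult", BigQueryIO.writeTableRows()
            .to("project2:asia_dataset.FACT_UNION")   // pre-created target table
            .withCreateDisposition(CreateDisposition.CREATE_NEVER)
            .withWriteDisposition(WriteDisposition.WRITE_TRUNCATE));
    p.run().waitUntilFinish();
  }
}
```

A join (rather than a union) would replace Flatten with a CoGroupByKey on the join key; the cross-region mechanics stay the same.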