All Questions
Tagged with google-cloud-dataflow google-cloud-pubsub
458 questions
0 votes · 1 answer · 39 views
Leaving message unacknowledged in Benthos job with gcp_pubsub input
How does Benthos handle the acknowledgement of pubsub messages? How can we manage ack/unack based on custom if-else conditions?
Here is the scenario I'm trying to achieve:
I have written a Benthos job ...
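For context on what such a config can look like: Benthos acknowledges an input message once it leaves the pipeline through a successful output, so conditional nacking is usually done by routing to the reject output. A hedged sketch, where the check expression and all names are made up:

    output:
      switch:
        cases:
          - check: this.status == "retry"   # hypothetical condition for unacking
            output:
              reject: "nack: sending back for redelivery"
          - output:
              gcp_pubsub:
                project: my-project
                topic: processed-topic

Messages that hit the reject case are nacked and redelivered by Pub/Sub; everything else is acked once the gcp_pubsub output succeeds.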
0 votes · 0 answers · 53 views
How to send batch messages to Pub/Sub from Apache Beam
I have a Beam pipeline that makes some transformations, and I want to write the data to Pub/Sub so it can be processed by other pipelines.
How do I send messages to Pub/Sub using Apache Beam?
I saw that ...
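A minimal Python sketch of what this is after, with placeholder project and topic names; note that WriteToPubSub expects bytes, and publish batching is handled by the sink rather than user code:

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    opts = PipelineOptions(streaming=True)
    with beam.Pipeline(options=opts) as p:
        (p
         | "Read" >> beam.io.ReadFromPubSub(
               subscription="projects/my-project/subscriptions/in-sub")
         | "Transform" >> beam.Map(lambda b: b.upper())  # stand-in transformation
         | "Write" >> beam.io.WriteToPubSub(
               topic="projects/my-project/topics/out-topic"))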
0 votes · 0 answers · 41 views
GCP Dataflow brand new one-time window each time for the same key
Is it possible in GCP Dataflow / Apache Beam to create a completely new window with a different start date for the same key in a Streaming Job (on GCP PubSub)? For example, when a message with this ...
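One pattern that approximates data-driven, per-key windows is session windowing, where each key gets a fresh window whenever its messages are separated by more than the gap. A sketch, not the asker's pipeline (Create stands in for the Pub/Sub source):

    import apache_beam as beam
    from apache_beam.transforms import window

    with beam.Pipeline() as p:
        (p
         | beam.Create([("key-a", 1), ("key-a", 2)])  # stand-in for the stream
         | beam.WindowInto(window.Sessions(10 * 60))  # new window after a 10-min gap
         | beam.GroupByKey()
         | beam.Map(print))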
1 vote · 1 answer · 42 views
Beam streaming pipeline not writing windowed files
I'm trying to run this example from Google on my local machine. I'm using the Pub/Sub emulator and Beam 2.60.0, executed with --runner=DirectRunner.
...
options.setStreaming(true);
options.setPubsubRootUrl(...
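For comparison, a Python-side sketch of the same local setup. It assumes the emulator is on localhost:8085 and that the DirectRunner's Pub/Sub client honors PUBSUB_EMULATOR_HOST, which is worth verifying for your Beam version:

    import os
    os.environ["PUBSUB_EMULATOR_HOST"] = "localhost:8085"  # assumed emulator address

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    opts = PipelineOptions(["--runner=DirectRunner", "--streaming"])
    with beam.Pipeline(options=opts) as p:
        (p
         | beam.io.ReadFromPubSub(
               subscription="projects/local-project/subscriptions/test-sub")
         | beam.Map(print))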
0 votes · 0 answers · 95 views
Proper windowing to use in Apache Beam / Dataflow merge of two Pub/Sub streams
Background
I had this question, which I have now "solved" with a marginally functional pipeline, but I don't like the solution.
Quick summary: that link shows two Pub/Sub streams that I want ...
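For what a join like that typically looks like: window both streams identically, key them, and CoGroupByKey. A rough sketch with made-up topic names and a made-up keying function:

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions
    from apache_beam.transforms import window

    opts = PipelineOptions(streaming=True)
    with beam.Pipeline(options=opts) as p:
        def keyed(label, topic):
            return (p
                    | f"Read{label}" >> beam.io.ReadFromPubSub(topic=topic)
                    | f"Key{label}" >> beam.Map(lambda b: (b.split(b",")[0], b))
                    | f"Win{label}" >> beam.WindowInto(window.FixedWindows(60)))

        left = keyed("A", "projects/my-project/topics/stream-a")
        right = keyed("B", "projects/my-project/topics/stream-b")

        ({"a": left, "b": right}
         | beam.CoGroupByKey()
         | beam.Map(print))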
0 votes · 0 answers · 54 views
How to get the Dataflow job success status using a Log Router on Dataflow logs
I have set up a log router for a Dataflow step with the following
filter: resource.type="dataflow_step" AND logName="projects/${PROJECT_ID}/logs/dataflow.googleapis.com%2Fworker" ...
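The worker log rarely carries the final job state. A hedged alternative (the log name and payload text below are assumptions that should be verified in Logs Explorer for your project) is to route the Dataflow job-message log and match state-change text:

    resource.type="dataflow_step"
    logName="projects/${PROJECT_ID}/logs/dataflow.googleapis.com%2Fjob-message"
    textPayload=~"(finished|failed|cancelled)"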
0 votes · 0 answers · 31 views
Read data from a Pub/Sub topic and write it into BigQuery through Dataflow using Python
I successfully created the Dataflow job, but it didn't complete as expected. It seems there might be an issue with reading data from Pub/Sub, which could be preventing the job from finishing correctly....
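A minimal sketch of the intended flow, with placeholder names and schema throughout. One point worth stating: a streaming job like this never "finishes" by design, since the Pub/Sub source is unbounded; it runs until drained or cancelled.

    import json
    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions

    opts = PipelineOptions(streaming=True)
    with beam.Pipeline(options=opts) as p:
        (p
         | beam.io.ReadFromPubSub(
               subscription="projects/my-project/subscriptions/my-sub")
         | beam.Map(lambda b: json.loads(b.decode("utf-8")))
         | beam.io.WriteToBigQuery(
               "my-project:my_dataset.my_table",
               schema="name:STRING,value:INTEGER",  # placeholder schema
               write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
               create_disposition=beam.io.BigQueryDisposition.CREATE_IF_NEEDED))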
0 votes · 0 answers · 103 views
JMS (AMQP) to PubSub
I'm working on setting up a GCP Dataflow job that pushes JMS messages to PubSub. I set up the job using the "JMS to PubSub" flex template with my queue (AMQP). Apparently the job doesn't ...
1 vote · 1 answer · 232 views
How can I efficiently insert more than 1 million records into Firestore?
Description:
I am working on a project where I need to insert more than 1 million records into Google Firestore. Currently, my approach is not efficient enough and the process is extremely slow. I am ...
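One common speed-up is batched writes with the google-cloud-firestore client. A sketch, where the collection and fields are made up; Firestore commits at most 500 writes per batch:

    from google.cloud import firestore

    db = firestore.Client()
    records = ({"id": str(i), "value": i} for i in range(1_000_000))  # stand-in data

    batch = db.batch()
    for n, rec in enumerate(records, start=1):
        batch.set(db.collection("records").document(rec["id"]), rec)
        if n % 500 == 0:       # commit at the 500-write batch limit
            batch.commit()
            batch = db.batch()
    batch.commit()             # flush the remainder

The client also ships a BulkWriter (db.bulk_writer()) that parallelizes and retries writes, which may be a better fit at this scale.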
1 vote · 0 answers · 88 views
Transfer/Stream data/CSV files from POS (Point of Sale) to GCS buckets and then to BigQuery
I am working on a project where I have to transfer/stream data/CSV files from an on-prem POS to GCS buckets; the data will first be saved to a BigQuery external table and then moved to other ...
0 votes · 0 answers · 115 views
Data Missing in Google Cloud Dataflow Dashboard
Your software uses a simple JSON format for all messages. These messages are published to Google Cloud Pub/Sub, then processed with Google Cloud Dataflow to create a real-time dashboard for the CFO. ...
0 votes · 0 answers · 73 views
Configure message retention duration of PubSub subscription created by Dataflow
I have a Dataflow pipeline that ingests messages from PubSub. This automatically creates a subscription; however, the retention duration is 7 days. I like that it creates the subscription so I don't ...
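One way around it, sketched here with placeholder names: create the subscription yourself with the retention you want and point the pipeline at it (ReadFromPubSub(subscription=...)), so Dataflow doesn't create its own:

    from google.cloud import pubsub_v1
    from google.protobuf import duration_pb2

    subscriber = pubsub_v1.SubscriberClient()
    subscriber.create_subscription(
        request={
            "name": subscriber.subscription_path("my-project", "my-sub"),
            "topic": "projects/my-project/topics/my-topic",
            # keep messages for 24h instead of the default
            "message_retention_duration": duration_pb2.Duration(seconds=24 * 3600),
        }
    )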
0 votes · 1 answer · 573 views
python dataflow : GroupByKey cannot be applied to an unbounded PCollection with global windowing and a default trigger
I have a simple Python Dataflow pipeline that uses an unbounded PCollection. It just:
Reads from Pub/Sub
Parses into JSON with output tags SUCCESS and FAILURE
Validates the JSON with output tags SUCCESS and ...
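The usual fix, sketched below with placeholder names: apply a windowing strategy (or a non-default trigger) to the unbounded PCollection before any GroupByKey:

    import apache_beam as beam
    from apache_beam.options.pipeline_options import PipelineOptions
    from apache_beam.transforms import window

    opts = PipelineOptions(streaming=True)
    with beam.Pipeline(options=opts) as p:
        (p
         | beam.io.ReadFromPubSub(
               subscription="projects/my-project/subscriptions/my-sub")
         | beam.Map(lambda b: ("key", b))            # stand-in keying
         | beam.WindowInto(window.FixedWindows(60))  # 60-second fixed windows
         | beam.GroupByKey()
         | beam.Map(print))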
0 votes · 1 answer · 778 views
How to stream data from Pub/Sub to Google BigTable using DataFlow?
I want to ask if someone can tell me, or even show me an example of, a Dataflow job template, preferably in Python, in which I can:
Continuously read JSON data from a Pub/Sub topic
Process this data /...
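A Python sketch of that shape using bigtableio.WriteToBigTable, which takes Bigtable DirectRow elements; every ID, key, and column below is a placeholder:

    import json
    import apache_beam as beam
    from apache_beam.io.gcp.bigtableio import WriteToBigTable
    from apache_beam.options.pipeline_options import PipelineOptions
    from google.cloud.bigtable import row as bt_row

    def to_row(payload: bytes) -> bt_row.DirectRow:
        data = json.loads(payload.decode("utf-8"))
        r = bt_row.DirectRow(row_key=data["id"].encode("utf-8"))
        r.set_cell("cf1", b"value", json.dumps(data).encode("utf-8"))
        return r

    opts = PipelineOptions(streaming=True)
    with beam.Pipeline(options=opts) as p:
        (p
         | beam.io.ReadFromPubSub(topic="projects/my-project/topics/my-topic")
         | beam.Map(to_row)
         | WriteToBigTable(project_id="my-project",
                           instance_id="my-instance",
                           table_id="my-table"))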
0 votes · 2 answers · 172 views
GCP PubSub to DLP Integration via Dataflow
I have a situation here. I want to figure out the best way to ingest streaming API data from an application into GCP BigQuery while having data masking in place. However, some downstream admin users ...
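For the masking piece, a sketch of calling the DLP API per record; the project, info types, and masking character are assumptions, not a recommendation for this exact setup:

    from google.cloud import dlp_v2

    dlp = dlp_v2.DlpServiceClient()

    def mask(text: str, project: str = "my-project") -> str:
        response = dlp.deidentify_content(
            request={
                "parent": f"projects/{project}",
                "inspect_config": {"info_types": [{"name": "EMAIL_ADDRESS"}]},
                "deidentify_config": {
                    "info_type_transformations": {
                        "transformations": [{
                            "primitive_transformation": {
                                "character_mask_config": {"masking_character": "#"}
                            }
                        }]
                    }
                },
                "item": {"value": text},
            }
        )
        return response.item.value

Calling DLP once per element can get expensive; batching several records per request is usually worth considering.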