I’m using the official Google Ruby gem and, while so far everything else I’ve tried is working fine (including listing projects, datasets and tables, and creating tables), kicking off a load job is failing with the following inside the JSON error response:
"Job configuration must contain exactly one job-specific configuration object (e.g., query, load, extract, spreadsheetExtract), but there were 0: "
The body string that I created looks like:
--xxx
Content-Type: application/json; charset=UTF-8
{"configuration":{"load":{"destinationTable":{"projectId":"mycompany.com:projectId","datasetId":"all_events","tableId":"install"},"createDisposition":"CREATE_NEVER","writeDisposition":"WRITE_APPEND"}}}
--xxx
Content-Type: application/octet-stream
test,second,1234,6789,83838
--xxx
I’ve previously created the install table with an appropriate schema for that data, so that shouldn’t be the problem.
And finally, just for completeness, here is the actual piece of code I’m using to fire off the request (these are two methods within a larger class):
def create_insert_job
  config = {
    'configuration' => {
      'load' => {
        'destinationTable' => {
          'projectId' => 'mycompany.com:projectId',
          'datasetId' => 'all_events',
          'tableId' => 'install'
        },
        'createDisposition' => 'CREATE_NEVER',
        'writeDisposition' => 'WRITE_APPEND'
      }
    }
  }
  body = "#{multipart_boundary}\n"
  body += "Content-Type: application/json; charset=UTF-8\n"
  body += "#{config.to_json}\n"
  body += "#{multipart_boundary}\n"
  body += "Content-Type: application/octet-stream\n"
  body += "test,second,1234,6789,83838\n"
  body += "#{multipart_boundary}\n"
  prepare_big_query # This simply gets tokens and instantiates google_client and big_query_api
  param_hash = { api_method: big_query_api.jobs.insert }
  param_hash[:parameters] = { 'projectId' => 'mycompany.com:projectId' }
  param_hash[:body] = body
  param_hash[:headers] = { 'Content-Type' => "multipart/related; boundary=#{multipart_boundary}" }
  result = google_client.execute(param_hash)
  JSON.parse(result.response.body)
end

def multipart_boundary
  '--xxx'
end
Any ideas?
ADDITIONS TO THE ANSWER BELOW TO MAKE THIS CODE WORK
Note that the #multipart_boundary method above returns the boundary with '--' already prepended. This is a problem, because the boundary parameter in the Content-Type header (set in the param hash) then becomes '--xxx' when it should be just 'xxx'.
Also, the docs for this gem are pretty rough: after fixing my newline problem (per @jcondit's answer) I got a new error about uploading to the wrong URL. That's because you need to add:
'uploadType' => 'multipart'
to the parameters in order to send the request to the proper URL.
So the final param_hash that worked (again, after fixing the newline and boundary issues) looks like:
param_hash = { api_method: big_query_api.jobs.insert }
param_hash[:parameters] = {'projectId' => project_id, 'uploadType' => 'multipart'}
param_hash[:body] = body
param_hash[:headers] = {'Content-Type' => "multipart/related; boundary=#{multipart_boundary}"}
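One way to keep the header and the body from drifting apart is to store only the bare token and derive the delimiter line from it. This is just a sketch: #multipart_delimiter and #multipart_headers are hypothetical helper names, not part of the gem.

```ruby
# Bare boundary token: this is what goes in the Content-Type header.
def multipart_boundary
  'xxx'
end

# Delimiter line used inside the body: the token with '--' prepended.
# (Hypothetical helper, not something the gem provides.)
def multipart_delimiter
  "--#{multipart_boundary}"
end

# Headers for the request; the boundary parameter carries no leading dashes.
# (Hypothetical helper, not something the gem provides.)
def multipart_headers
  { 'Content-Type' => "multipart/related; boundary=#{multipart_boundary}" }
end

puts multipart_headers['Content-Type']
puts multipart_delimiter
```

With this split, param_hash[:headers] can be set to multipart_headers, and the body uses multipart_delimiter wherever a separator line is needed.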
You need to insert an extra newline between the headers of each MIME part and the body of each MIME part. The body of your request should look like this:
--xxx
Content-Type: application/json; charset=UTF-8

{"configuration":{"load":{"destinationTable":{"projectId":"mycompany.com:projectId","datasetId":"all_events","tableId":"install"},"createDisposition":"CREATE_NEVER","writeDisposition":"WRITE_APPEND"}}}
--xxx
Content-Type: application/octet-stream

test,second,1234,6789,83838
--xxx--
Note the extra newline after the Content-Type header in each part.
Also, don't forget that the final boundary separator has a trailing -- appended to it (here, --xxx--).
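Putting those points together, a body builder might look roughly like the sketch below. build_multipart_body is a hypothetical helper name. It takes the bare boundary token ('xxx', without the leading dashes) and uses "\n" line endings as in the question. The MIME spec formally calls for CRLF, so the line ending is left as a parameter.

```ruby
require 'json'

# Builds a two-part multipart/related body: JSON job config, then raw data.
# `boundary` is the bare token (no leading '--'); `eol` is the line ending.
def build_multipart_body(config, data, boundary, eol: "\n")
  [
    "--#{boundary}",
    'Content-Type: application/json; charset=UTF-8',
    '',                 # blank line between part headers and part body
    config.to_json,
    "--#{boundary}",
    'Content-Type: application/octet-stream',
    '',                 # blank line between part headers and part body
    data,
    "--#{boundary}--"   # closing delimiter has a trailing '--'
  ].join(eol) + eol
end

config = {
  'configuration' => {
    'load' => {
      'destinationTable' => {
        'projectId' => 'mycompany.com:projectId',
        'datasetId' => 'all_events',
        'tableId' => 'install'
      },
      'createDisposition' => 'CREATE_NEVER',
      'writeDisposition' => 'WRITE_APPEND'
    }
  }
}

puts build_multipart_body(config, 'test,second,1234,6789,83838', 'xxx')
```

The result would be passed as param_hash[:body], with the same bare token used for boundary= in the Content-Type header.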