VoiceML requests¶
When Cloud Voice needs instructions for a call, it sends a VoiceML request to one of the handler URIs configured for that call. The request is an HTTP POST with a JSON body. Your server replies with a VoiceML XML document, and Cloud Voice executes the actions in that document on the call. When an action that has a NextUri finishes, Cloud Voice sends the next VoiceML request, and the cycle repeats until the call ends.
Difference between VoiceML requests and webhooks
VoiceML requests are synchronous request/response: your server must reply with a VoiceML XML document, and Cloud Voice acts on that response. They are sent to the call's handler URIs — for incoming calls, the URI configured on the phone number; for outgoing calls, the handlerUris field of Call.options.
Voice Webhooks are fire-and-forget event notifications — you return any 2xx status code, and the response body is ignored. They are sent for events such as CALL_STATE_UPDATE, FAX_READY, RECORDING_READY, and TRANSCRIPTION_READY.
Request¶
Cloud Voice sends a VoiceML request as:
- Method: POST
- Headers:
  - Content-Type: application/json; charset=utf-8
  - X-Enfonica-Event: CALL
  - X-Enfonica-Signature: HMAC-SHA256 signature for authenticity verification. The signing procedure is the same as for voice webhooks.
- Body: a JSON-serialized enfonica.voice.v1beta1.CallRequest.
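A receiving server would normally check the signature header before trusting the body. The following is a minimal Python sketch only. It assumes the signature is the hex-encoded HMAC-SHA256 of the raw request body, keyed with a shared secret; that encoding and the names verify_signature and signing_secret are assumptions made for illustration, so follow the voice webhooks documentation for the actual signing procedure.

```python
import hashlib
import hmac


def verify_signature(raw_body: bytes, signature_header: str, signing_secret: bytes) -> bool:
    """Return True if the header matches the HMAC-SHA256 of the body.

    Assumption: the signature is a hex digest computed over the raw body.
    The real procedure is defined in the voice webhooks documentation.
    """
    expected = hmac.new(signing_secret, raw_body, hashlib.sha256).hexdigest()
    received = signature_header.strip().lower()
    # compare_digest performs a constant-time comparison.
    return hmac.compare_digest(expected.encode("utf-8"), received.encode("utf-8"))
```

Verifying against the raw bytes, before any JSON parsing, avoids mismatches caused by re-serializing the body.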
Request body¶
The body has two fields:
| Field | Description |
|---|---|
| call | A JSON-serialized enfonica.voice.v1beta1.Call representing the call at the time of the request. The state field reflects the call's state at the moment Cloud Voice issued the request; for example, STARTING for an incoming call that has not yet been answered, or IN_PROGRESS after an <Input> collected digits on an answered call. |
| parameters | A string-to-string map of return values from the action that most recently completed. Empty (or absent) on the first VoiceML request of a call. On subsequent requests, contains the parameters documented on the action that produced them (see Return parameters below). |
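Because parameters can be either empty or absent, a handler should not index into it unconditionally. The sketch below shows one way to read the two fields defensively in Python; the function name parse_call_request is hypothetical.

```python
import json
from typing import Any, Dict, Tuple


def parse_call_request(raw_body: bytes) -> Tuple[Dict[str, Any], Dict[str, str]]:
    """Split a CallRequest JSON body into (call, parameters)."""
    payload = json.loads(raw_body.decode("utf-8"))
    call = payload.get("call", {})
    # parameters may be empty or absent on the first request of a call.
    parameters = payload.get("parameters") or {}
    return call, parameters


first = b'{"call": {"state": "STARTING", "direction": "INCOMING"}}'
call, parameters = parse_call_request(first)
print(call["state"], parameters)  # STARTING {}
```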
Return parameters¶
The following actions populate parameters on the next VoiceML request:
- <Call>: action, callStatus, callEndpoint
- <Input>: action, digits
- <Record>: action, recording, estimatedDurationSeconds
- <Stream>: any parameters set via update_request_parameters
Other actions (such as <Say>, <Play>, <Wait>) do not populate parameters.
Example: first request¶
The first VoiceML request for an incoming PSTN call. The call has not been answered yet, so state is STARTING and parameters is empty.
{
  "call": {
    "name": "projects/my-project/calls/abc123def456",
    "to": "+61399998888",
    "from": "+61400111222",
    "isPrivate": false,
    "transport": "PSTN",
    "direction": "INCOMING",
    "state": "STARTING",
    "createTime": "2026-05-15T14:23:01.234Z",
    "startTime": "2026-05-15T14:23:01.234Z",
    "createMethod": "INCOMING_CALL",
    "fromLocation": {
      "regionCode": "AU",
      "administrativeArea": "Victoria",
      "locality": "Melbourne"
    },
    "fromZone": "Melbourne"
  },
  "parameters": {}
}
Example: follow-up request after <Input>¶
The same call after your application returned a VoiceML response containing <Input>, and the caller pressed 1#. The call is now answered, so state is IN_PROGRESS and answerTime is set. The parameters map carries the values documented on <Input>.
{
  "call": {
    "name": "projects/my-project/calls/abc123def456",
    "to": "+61399998888",
    "from": "+61400111222",
    "isPrivate": false,
    "transport": "PSTN",
    "direction": "INCOMING",
    "state": "IN_PROGRESS",
    "createTime": "2026-05-15T14:23:01.234Z",
    "startTime": "2026-05-15T14:23:01.234Z",
    "answerTime": "2026-05-15T14:23:03.456Z",
    "createMethod": "INCOMING_CALL",
    "fromLocation": {
      "regionCode": "AU",
      "administrativeArea": "Victoria",
      "locality": "Melbourne"
    },
    "fromZone": "Melbourne"
  },
  "parameters": {
    "action": "INPUT",
    "digits": "1"
  }
}
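A handler for this follow-up request typically branches on the digits value. The Python sketch below illustrates the idea; the announcement texts and the function name respond_to_input are hypothetical, and only the <Say> action and the digits parameter come from this page.

```python
from typing import Dict
from xml.sax.saxutils import escape

# Hypothetical announcements for each menu option.
ANNOUNCEMENTS = {
    "1": "Connecting you to sales.",
    "2": "Connecting you to support.",
    "3": "Connecting you to the front desk.",
}


def respond_to_input(parameters: Dict[str, str]) -> str:
    """Build a VoiceML document from the parameters returned by <Input>."""
    digits = parameters.get("digits", "")
    message = ANNOUNCEMENTS.get(digits, "Sorry, that option is not available.")
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        "<Response>\n"
        f"  <Say>{escape(message)}</Say>\n"
        "</Response>\n"
    )


print(respond_to_input({"action": "INPUT", "digits": "1"}))
```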
Response¶
Your server replies with:
- Status: 200 OK. Any 2xx status is accepted. Redirects (3xx) are followed.
- Header: Content-Type indicating XML, for example application/xml.
- Body: a VoiceML document, that is, a <Response> element wrapping zero or more VoiceML actions.
Example response¶
<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Input MaxDigitCount="1" NextUri="/menu">
    <Say>Press 1 for sales, 2 for support, or 3 for everything else.</Say>
  </Input>
</Response>
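Building the document with an XML library rather than string concatenation guarantees well-formed output and correct escaping of prompt text. The sketch below produces an equivalent document with Python's standard xml.etree.ElementTree (Python 3.8 or later for xml_declaration); the function name build_menu_response is hypothetical.

```python
import xml.etree.ElementTree as ET


def build_menu_response(prompt: str, next_uri: str) -> bytes:
    """Return a <Response> containing one <Input> that wraps a <Say> prompt."""
    response = ET.Element("Response")
    input_el = ET.SubElement(
        response, "Input", {"MaxDigitCount": "1", "NextUri": next_uri}
    )
    say = ET.SubElement(input_el, "Say")
    say.text = prompt  # special characters are escaped on serialization
    return ET.tostring(response, encoding="UTF-8", xml_declaration=True)


body = build_menu_response(
    "Press 1 for sales, 2 for support, or 3 for everything else.", "/menu"
)
print(body.decode("utf-8"))
```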
Failure handling¶
A non-2xx response, a network failure, or a body that cannot be parsed as VoiceML is treated as a failure. Cloud Voice will try the next handler URI (if multiple are configured) and then retry. The retry schedule is the same as for voice webhooks — see Webhooks → Response.
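Since an unparseable body counts as a failure, it can be worth checking a generated document before sending it. The Python sketch below only tests that the body is well-formed XML with a <Response> root; it is an assumption that this is a useful pre-check, and Cloud Voice's own validation of actions and attributes is not reproduced here. The function name looks_like_voiceml is hypothetical.

```python
import xml.etree.ElementTree as ET


def looks_like_voiceml(body: bytes) -> bool:
    """Cheap sanity check: well-formed XML whose root element is <Response>."""
    try:
        root = ET.fromstring(body)
    except ET.ParseError:
        return False
    return root.tag == "Response"


print(looks_like_voiceml(b"<Response><Say>Hello</Say></Response>"))  # True
print(looks_like_voiceml(b"<Response><Say>Hello</Response>"))        # False
```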
Lifecycle¶
A typical call goes through this loop:
- An event triggers the first request — an incoming call arrives, or an outgoing call created via the API reaches the point where it needs instructions.
- Cloud Voice POSTs a CallRequest to the first handler URI. parameters is empty.
- Your server replies 200 OK with a <Response> document.
- Cloud Voice executes the actions in order.
- When an action with a NextUri finishes, Cloud Voice POSTs the next CallRequest. parameters contains the return values from the action that just completed.
- If execution reaches the end of the document without a NextUri or other continuation, there may be no further CallRequest; the call flow ends after the final action completes.
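The loop above can be sketched as a small HTTP handler using only Python's standard library. This is an illustration under assumptions, not a reference implementation: the names choose_response, VoiceMLHandler, and serve, the port, and the spoken text are hypothetical, and signature verification is omitted for brevity.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from xml.sax.saxutils import escape

MENU_XML = (
    '<?xml version="1.0" encoding="UTF-8"?>\n'
    "<Response>\n"
    '  <Input MaxDigitCount="1" NextUri="/menu">\n'
    "    <Say>Press 1 for sales, 2 for support, or 3 for everything else.</Say>\n"
    "  </Input>\n"
    "</Response>\n"
)


def choose_response(call_request: dict) -> str:
    """Pick the VoiceML document for one step of the loop."""
    parameters = call_request.get("parameters") or {}
    if parameters.get("action") == "INPUT":
        # Follow-up request: the <Input> action has finished.
        digits = escape(parameters.get("digits", ""))
        # No NextUri here, so no further CallRequest is expected.
        return f"<Response><Say>You pressed {digits}.</Say></Response>"
    # First request: parameters is empty, so present the menu.
    return MENU_XML


class VoiceMLHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        length = int(self.headers.get("Content-Length", "0"))
        call_request = json.loads(self.rfile.read(length) or b"{}")
        body = choose_response(call_request).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/xml")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port: int = 8080) -> None:
    """Run the handler until interrupted (call this explicitly to start)."""
    HTTPServer(("", port), VoiceMLHandler).serve_forever()
```

Keeping the decision logic in choose_response, separate from the HTTP plumbing, makes each step of the call flow easy to test without a live call.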
Cookies¶
VoiceML requests maintain cookies on the controlling call. You can use cookies to maintain state across a conversation, either by setting cookies directly or by using the built-in session state of your programming language or framework.
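As an illustration of setting cookies directly, the Python sketch below counts how many VoiceML requests a call has made. It assumes standard HTTP cookie semantics, meaning that a value sent in a Set-Cookie response header is returned in the Cookie header of later requests for the same call; the cookie name attempts and the function name next_attempt are hypothetical.

```python
from http.cookies import SimpleCookie
from typing import Tuple


def next_attempt(cookie_header: str) -> Tuple[int, str]:
    """Read the counter from a Cookie header and return (count, Set-Cookie value)."""
    jar = SimpleCookie()
    jar.load(cookie_header or "")
    try:
        previous = int(jar["attempts"].value) if "attempts" in jar else 0
    except ValueError:
        previous = 0  # ignore a malformed counter and start again
    attempts = previous + 1
    out = SimpleCookie()
    out["attempts"] = str(attempts)
    return attempts, out["attempts"].OutputString()


print(next_attempt(""))            # (1, 'attempts=1')
print(next_attempt("attempts=2"))  # (3, 'attempts=3')
```

The second element of the returned tuple is the value to send in a Set-Cookie header alongside the VoiceML response.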